2006-01-16 06:35:34

by Serge E. Hallyn

Subject: 2.6.15-mm4 failure on power5

On my power5 partition, 2.6.15-mm4 hangs on boot with the following
console output. 2.6.15-mm3 booted fine. .config attached.

boot: quicktest
Please wait, loading kernel...
Elf64 kernel loaded...
OF stdout device is: /vdevice/vty@30000000
Hypertas detected, assuming LPAR !
command line: ro console=hvc0 root=/dev/sda6 smt-enabled=1
memory layout at init:
memory_limit : 0000000000000000 (16 MB aligned)
alloc_bottom : 0000000002223000
alloc_top : 0000000008000000
alloc_top_hi : 0000000088000000
rmo_top : 0000000008000000
ram_top : 0000000088000000
Looking for displays
instantiating rtas at 0x00000000077d7000 ... done
0000000000000000 : boot cpu 0000000000000000
0000000000000002 : starting cpu hw idx 0000000000000002... done
0000000000000004 : starting cpu hw idx 0000000000000004... done
0000000000000006 : starting cpu hw idx 0000000000000006... done
copying OF device tree ...
Building dt strings...
Building dt structure...
Device tree strings 0x0000000002424000 -> 0x0000000002424f36
Device tree struct 0x0000000002425000 -> 0x000000000242c000
Calling quiesce ...
returning from prom_init
Page orders: linear mapping = 24, others = 12

thanks,
-serge


Attachments:
.config (23.82 kB)

2006-01-16 07:06:24

by Andrew Morton

Subject: Re: 2.6.15-mm4 failure on power5

"Serge E. Hallyn" <[email protected]> wrote:
>
> On my power5 partition, 2.6.15-mm4 hangs on boot with the following
> console output. 2.6.15-mm3 booted fine. .config attached.
>
> boot: quicktest
> Please wait, loading kernel...
> Elf64 kernel loaded...
> OF stdout device is: /vdevice/vty@30000000
> Hypertas detected, assuming LPAR !
> command line: ro console=hvc0 root=/dev/sda6 smt-enabled=1
> memory layout at init:
> memory_limit : 0000000000000000 (16 MB aligned)
> alloc_bottom : 0000000002223000
> alloc_top : 0000000008000000
> alloc_top_hi : 0000000088000000
> rmo_top : 0000000008000000
> ram_top : 0000000088000000
> Looking for displays
> instantiating rtas at 0x00000000077d7000 ... done
> 0000000000000000 : boot cpu 0000000000000000
> 0000000000000002 : starting cpu hw idx 0000000000000002... done
> 0000000000000004 : starting cpu hw idx 0000000000000004... done
> 0000000000000006 : starting cpu hw idx 0000000000000006... done
> copying OF device tree ...
> Building dt strings...
> Building dt structure...
> Device tree strings 0x0000000002424000 -> 0x0000000002424f36
> Device tree struct 0x0000000002425000 -> 0x000000000242c000
> Calling quiesce ...
> returning from prom_init
> Page orders: linear mapping = 24, others = 12

It might be worth reverting the changes to arch/powerpc/mm/hash_utils_64.c,
see if that unbreaks it.

- base = lmb.memory.region[i].base + KERNELBASE;
+ base = (unsigned long)__va(lmb.memory.region[i].base);

The nice comment in page.h:

* KERNELBASE is the virtual address of the start of the kernel, it's often
* the same as PAGE_OFFSET, but _might not be_.
*
* The kdump dump kernel is one example where KERNELBASE != PAGE_OFFSET.
*
* To get a physical address from a virtual one you subtract PAGE_OFFSET,
* _not_ KERNELBASE.

Tells us that was not an equivalent transformation.

2006-01-16 13:01:12

by Michael Ellerman

Subject: Re: 2.6.15-mm4 failure on power5

On Mon, 16 Jan 2006 18:05, Andrew Morton wrote:
> "Serge E. Hallyn" <[email protected]> wrote:
> > On my power5 partition, 2.6.15-mm4 hangs on boot
>
> It might be worth reverting the changes to arch/powerpc/mm/hash_utils_64.c,
> see if that unbreaks it.
>
> - base = lmb.memory.region[i].base + KERNELBASE;
> + base = (unsigned long)__va(lmb.memory.region[i].base);

You can try it, but if that fixes the problem I'll buy a sombrero and then eat
it.

> The nice comment in page.h:
>
> * KERNELBASE is the virtual address of the start of the kernel, it's often
> * the same as PAGE_OFFSET, but _might not be_.
> *
> * The kdump dump kernel is one example where KERNELBASE != PAGE_OFFSET.
> *
> * To get a physical address from a virtual one you subtract PAGE_OFFSET,
> * _not_ KERNELBASE.
>
> Tells us that was not an equivalent transformation.

True, not equivalent in all cases, but correct. For non-kdump kernels (which I
assume this is) KERNELBASE == PAGE_OFFSET, and for a kdump kernel that code
wants to use PAGE_OFFSET, not KERNELBASE.
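
As an illustration of the point above (a sketch, not code from the thread): the two expressions agree only while KERNELBASE equals PAGE_OFFSET. The constants and the names va_model and old_expr_model below are assumptions made for this example; the real definitions live in the powerpc page.h and depend on the kernel configuration.

```c
#include <stdint.h>

/* Illustrative values only, not taken from any real configuration. */
#define PAGE_OFFSET_EXAMPLE 0xc000000000000000ULL
#define KDUMP_LOAD_OFFSET   0x0000000002000000ULL /* hypothetical */

/* Models __va(): physical to virtual always adds PAGE_OFFSET. */
static uint64_t va_model(uint64_t phys)
{
    return phys + PAGE_OFFSET_EXAMPLE;
}

/* Models the old expression: physical address plus KERNELBASE. */
static uint64_t old_expr_model(uint64_t phys, uint64_t kernelbase)
{
    return phys + kernelbase;
}
```

With kernelbase set to PAGE_OFFSET_EXAMPLE both functions return the same address. With kernelbase set to PAGE_OFFSET_EXAMPLE + KDUMP_LOAD_OFFSET (the kdump-style case) they differ by the load offset, and only va_model gives the linear-mapping address.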

Try enabling early debugging (see arch/powerpc/kernel/setup_64.c) and then
turning on DEBUG in hash_utils_64.c, setup_64.c etc.

cheers

--
Michael Ellerman
IBM OzLabs

email: michael:ellerman.id.au
inmsg: mpe:jabber.org
wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person



2006-01-16 21:52:59

by Dave C Boutcher

Subject: Re: 2.6.15-mm4 failure on power5

On Mon, Jan 16, 2006 at 09:37:48AM -0600, Serge E. Hallyn wrote:
> Quoting Michael Ellerman ([email protected]):
> > On Mon, 16 Jan 2006 18:05, Andrew Morton wrote:
> > > "Serge E. Hallyn" <[email protected]> wrote:
> > > > On my power5 partition, 2.6.15-mm4 hangs on boot
>
> boot: quicktest
> Please wait, loading kernel...

...

> Page orders: linear mapping = 24, others = 12
> -> smp_release_cpus()
> <- smp_release_cpus()
> <- setup_system()
>
> So setup_system() at least finishes, though I don't see the
> printk's at the bottom of that function.

2.6.15-mm4 won't boot on my power5 either. I tracked it down to the
following mutex patch from Ingo: kernel-kernel-cpuc-to-mutexes.patch

If I revert just that patch, mm4 boots fine. It's really not obvious to
me at all why that patch is breaking things though...

--
Dave Boutcher

2006-01-17 01:09:50

by Andrew Morton

Subject: Re: 2.6.15-mm4 failure on power5

Dave C Boutcher <[email protected]> wrote:
>
> On Mon, Jan 16, 2006 at 09:37:48AM -0600, Serge E. Hallyn wrote:
> > Quoting Michael Ellerman ([email protected]):
> > > On Mon, 16 Jan 2006 18:05, Andrew Morton wrote:
> > > > "Serge E. Hallyn" <[email protected]> wrote:
> > > > > On my power5 partition, 2.6.15-mm4 hangs on boot
> >
> > boot: quicktest
> > Please wait, loading kernel...
>
> ...
>
> > Page orders: linear mapping = 24, others = 12
> > -> smp_release_cpus()
> > <- smp_release_cpus()
> > <- setup_system()
> >
> > So setup_system() at least finishes, though I don't see the
> > printk's at the bottom of that function.
>
> 2.6.15-mm4 won't boot on my power5 either. I tracked it down to the
> following mutex patch from Ingo: kernel-kernel-cpuc-to-mutexes.patch

Thanks for doing that - I know it's a lot of work, but boy it helps.

<mutters something unprintable about mutex patches and work prioritisation>

> If I revert just that patch, mm4 boots fine. It's really not obvious to
> me at all why that patch is breaking things though...
>

Yes, that is strange. I do recall that if something accidentally enables
interrupts too early in boot, ppc64 machines tend to go comatose. But if
we'd been running that code under local_irq_disable(), down() would have
spat a warning.

Drat, it seems I don't have CPU hotplug in my ppc64 config.

2006-01-17 08:17:55

by Ingo Molnar

Subject: Re: 2.6.15-mm4 failure on power5


* Andrew Morton <[email protected]> wrote:

> > If I revert just that patch, mm4 boots fine. It's really not obvious to
> > me at all why that patch is breaking things though...
>
> Yes, that is strange. I do recall that if something accidentally
> enables interrupts too early in boot, ppc64 machines tend to go
> comatose. But if we'd been running that code under
> local_irq_disable(), down() would have spat a warning.

perhaps it was just luck it worked so far, and the bug could have had
worse incarnations than the current clear hang if a certain generic
codepath is touched in a perfectly valid way. Does CONFIG_DEBUG_MUTEXES
(or any of the other debugging options) make any noise?

Ingo

2006-01-17 08:47:58

by Andrew Morton

Subject: Re: 2.6.15-mm4 failure on power5

Ingo Molnar <[email protected]> wrote:
>
>
> * Andrew Morton <[email protected]> wrote:
>
> > > If I revert just that patch, mm4 boots fine. It's really not obvious to
> > > me at all why that patch is breaking things though...
> >
> > Yes, that is strange. I do recall that if something accidentally
> > enables interrupts too early in boot, ppc64 machines tend to go
> > comatose. But if we'd been running that code under
> > local_irq_disable(), down() would have spat a warning.
>
> perhaps it was just luck it worked so far, and the bug could have had
> worse incarnations than the current clear hang if a certain generic
> codepath is touched in a perfectly valid way. Does CONFIG_DEBUG_MUTEXES
> (or any of the other debugging options) make any noise?
>

The bug happens on the G5 too. There's nothing useful on the screen,
nothing on netconsole. Could the people whose machines have a fscking
serial port please try CONFIG_DEBUG_MUTEXES?

2006-01-17 12:22:33

by Serge E. Hallyn

Subject: Re: 2.6.15-mm4 failure on power5

Quoting Dave C Boutcher ([email protected]):
> On Mon, Jan 16, 2006 at 09:37:48AM -0600, Serge E. Hallyn wrote:
> > Quoting Michael Ellerman ([email protected]):
> > > On Mon, 16 Jan 2006 18:05, Andrew Morton wrote:
> > > > "Serge E. Hallyn" <[email protected]> wrote:
> > > > > On my power5 partition, 2.6.15-mm4 hangs on boot
> >
> > boot: quicktest
> > Please wait, loading kernel...
>
> ...
>
> > Page orders: linear mapping = 24, others = 12
> > -> smp_release_cpus()
> > <- smp_release_cpus()
> > <- setup_system()
> >
> > So setup_system() at least finishes, though I don't see the
> > printk's at the bottom of that function.
>
> 2.6.15-mm4 won't boot on my power5 either. I tracked it down to the
> following mutex patch from Ingo: kernel-kernel-cpuc-to-mutexes.patch
>
> If I revert just that patch, mm4 boots fine. It's really not obvious to
> me at all why that patch is breaking things though...

FWIW this fixes mine as well.

-serge

2006-01-17 13:40:19

by Michael Ellerman

Subject: Re: 2.6.15-mm4 failure on power5

On Tue, 17 Jan 2006 08:52, Dave C Boutcher wrote:
> 2.6.15-mm4 won't boot on my power5 either. I tracked it down to the
> following mutex patch from Ingo: kernel-kernel-cpuc-to-mutexes.patch
>
> If I revert just that patch, mm4 boots fine. It's really not obvious to
> me at all why that patch is breaking things though...

My POWER5 (gr) LPAR seems to boot ok (3 times so far) with that patch, guess
it's something subtle. That's with CONFIG_DEBUG_MUTEXES=y. And it's just
booted once with CONFIG_DEBUG_MUTEXES=n.

And now it's booted the full mm4 patch set without blinking.

cheers

--
Michael Ellerman
IBM OzLabs

email: michael:ellerman.id.au
inmsg: mpe:jabber.org
wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person



2006-01-17 14:00:52

by Ingo Molnar

Subject: Re: 2.6.15-mm4 failure on power5


* Michael Ellerman <[email protected]> wrote:

> On Tue, 17 Jan 2006 08:52, Dave C Boutcher wrote:
> > 2.6.15-mm4 won't boot on my power5 either. I tracked it down to the
> > following mutex patch from Ingo: kernel-kernel-cpuc-to-mutexes.patch
> >
> > If I revert just that patch, mm4 boots fine. It's really not obvious to
> > me at all why that patch is breaking things though...
>
> My POWER5 (gr) LPAR seems to boot ok (3 times so far) with that patch,
> guess it's something subtle. That's with CONFIG_DEBUG_MUTEXES=y. And
> it's just booted once with CONFIG_DEBUG_MUTEXES=n.
>
> And now it's booted the full mm4 patch set without blinking.

so it booted fine with CONFIG_DEBUG_MUTEXES=n but with that patch not
applied?

the patch will likely work around the bug, so DEBUG_MUTEXES=y/n should
make no difference with that patch applied.

Ingo

2006-01-17 16:52:50

by Dave C Boutcher

Subject: Re: 2.6.15-mm4 failure on power5

On Tue, Jan 17, 2006 at 09:17:49AM +0100, Ingo Molnar wrote:
>
> * Andrew Morton <[email protected]> wrote:
>
> > > If I revert just that patch, mm4 boots fine. It's really not obvious to
> > > me at all why that patch is breaking things though...
> >
> > Yes, that is strange. I do recall that if something accidentally
> > enables interrupts too early in boot, ppc64 machines tend to go
> > comatose. But if we'd been running that code under
> > local_irq_disable(), down() would have spat a warning.
>
> perhaps it was just luck it worked so far, and the bug could have had
> worse incarnations than the current clear hang if a certain generic
> codepath is touched in a perfectly valid way. Does CONFIG_DEBUG_MUTEXES
> (or any of the other debugging options) make any noise?

Well, it turns out that I've been running with CONFIG_DEBUG_MUTEXES all
along...so no noise. My console output is a little different than
Serge's, so I think this is timing related. Also note that I'm dying in
the timer interrupt...

Please wait, loading kernel...
Elf64 kernel loaded...
Loading ramdisk...
ramdisk loaded at 02600000, size: 1212 Kbytes
OF stdout device is: /vdevice/vty@30000000
Hypertas detected, assuming LPAR !
command line: root=/dev/sda3 selinux=0 elevator=cfq
memory layout at init:
memory_limit : 0000000000000000 (16 MB aligned)
alloc_bottom : 000000000272f000
alloc_top : 0000000008000000
alloc_top_hi : 0000000100000000
rmo_top : 0000000008000000
ram_top : 0000000100000000
Looking for displays
found display : /pci@800000020000002/pci@2,6/pci@1/display@0, opening
... doneinstantiating rtas at 0x0000000007734000 ... done
0000000000000000 : boot cpu 0000000000000000
0000000000000002 : starting cpu hw idx 0000000000000002... done
0000000000000004 : starting cpu hw idx 0000000000000004... done
0000000000000006 : starting cpu hw idx 0000000000000006... done
copying OF device tree ...
Building dt strings...
Building dt structure...
Device tree strings 0x0000000002a30000 -> 0x0000000002a313f5
Device tree struct 0x0000000002a32000 -> 0x0000000002a42000
Calling quiesce ...
returning from prom_init
Page orders: linear mapping = 24, others = 12
Found initrd at 0xc000000002600000:0xc00000000272f000
cpu 0x0: Vector: 300 (Data Access) at [c000000000577520]
pc: c000000000021064: .timer_interrupt+0xf4/0x440
lr: c000000000021020: .timer_interrupt+0xb0/0x440
sp: c0000000005777a0
msr: 8000000000001032
dar: 10
dsisr: 40000000
current = 0xc0000000005c1150
paca = 0xc0000000005c1d00
pid = 0, comm = swapper
enter ? for help
0:mon>


--
Dave Boutcher

2006-01-17 16:55:59

by Dave C Boutcher

Subject: Re: 2.6.15-mm4 failure on power5

On Tue, Jan 17, 2006 at 10:52:44AM -0600, Dave C Boutcher wrote:
> Well, it turns out that I've been running with CONFIG_DEBUG_MUTEXES all
> along...so no noise. My console output is a little different than
> Serge's, so I think this is timing related. Also note that I'm dying in
> the timer interrupt...

duh... here's the backtrace
0:mon> t
[c000000000577890] c0000000000034b4 decrementer_common+0xb4/0x100
--- Exception: 901 (Decrementer) at c0000000004627ec
.__mutex_lock_interruptible_slowpath+0x3bc/0x4c4
[c000000000577c60] c000000000075064 .__lock_cpu_hotplug+0x44/0xa8
[c000000000577ce0] c000000000075600 .register_cpu_notifier+0x24/0x68
[c000000000577d70] c00000000052cd7c .do_init_bootmem+0x68c/0xab0
[c000000000577e50] c000000000522c84 .setup_arch+0x21c/0x2c0
[c000000000577ef0] c00000000051a538 .start_kernel+0x40/0x280
[c000000000577f90] c000000000008574 .hmt_init+0x0/0x8c


--
Dave Boutcher

2006-01-18 00:19:57

by Michael Ellerman

Subject: Re: 2.6.15-mm4 failure on power5

On Wed, 18 Jan 2006 01:00, Ingo Molnar wrote:
> * Michael Ellerman <[email protected]> wrote:
> > On Tue, 17 Jan 2006 08:52, Dave C Boutcher wrote:
> > > 2.6.15-mm4 won't boot on my power5 either. I tracked it down to the
> > > following mutex patch from Ingo: kernel-kernel-cpuc-to-mutexes.patch
> > >
> > > If I revert just that patch, mm4 boots fine. It's really not obvious to
> > > me at all why that patch is breaking things though...
> >
> > My POWER5 (gr) LPAR seems to boot ok (3 times so far) with that patch,
> > guess it's something subtle. That's with CONFIG_DEBUG_MUTEXES=y. And
> > it's just booted once with CONFIG_DEBUG_MUTEXES=n.
> >
> > And now it's booted the full mm4 patch set without blinking.
>
> so it booted fine with CONFIG_DEBUG_MUTEXES=n but with that patch not
> applied?
>
> the patch will likely work around the bug, so DEBUG_MUTEXES=y/n should
> make no difference with that patch applied.

It booted fine _with_ the patch applied, with DEBUG_MUTEXES=y and n.

Boutcher, to be clear, you can't boot with kernel-kernel-cpuc-to-mutexes.patch
applied and DEBUG_MUTEXES=y ?

But if you revert kernel-kernel-cpuc-to-mutexes.patch it boots ok?

This is looking quite similar to another hang we're seeing on Power4 iSeries
on mainline git:
http://ozlabs.org/pipermail/linuxppc64-dev/2006-January/007679.html

cheers

--
Michael Ellerman
IBM OzLabs

email: michael:ellerman.id.au
inmsg: mpe:jabber.org
wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person



2006-01-18 03:32:45

by Dave C Boutcher

Subject: Re: 2.6.15-mm4 failure on power5

On Wed, Jan 18, 2006 at 11:19:36AM +1100, Michael Ellerman wrote:
> It booted fine _with_ the patch applied, with DEBUG_MUTEXES=y and n.
>
> Boutcher, to be clear, you can't boot with kernel-kernel-cpuc-to-mutexes.patch
> applied and DEBUG_MUTEXES=y ?
>
> But if you revert kernel-kernel-cpuc-to-mutexes.patch it boots ok?
>
> This is looking quite similar to another hang we're seeing on Power4 iSeries
> on mainline git:
> http://ozlabs.org/pipermail/linuxppc64-dev/2006-January/007679.html

Correct...I die in exactly the same place every time with
DEBUG_MUTEXES=Y. I posted a backtrace that points into the _lock_cpu
code, but I haven't really dug into the issue yet. I believe this is
very timing related (Serge was dying slightly differently).

--
Dave Boutcher

2006-01-18 06:37:36

by Ingo Molnar

Subject: Re: 2.6.15-mm4 failure on power5


* Dave C Boutcher <[email protected]> wrote:

> On Wed, Jan 18, 2006 at 11:19:36AM +1100, Michael Ellerman wrote:
> > It booted fine _with_ the patch applied, with DEBUG_MUTEXES=y and n.
> >
> > Boutcher, to be clear, you can't boot with kernel-kernel-cpuc-to-mutexes.patch
> > applied and DEBUG_MUTEXES=y ?
> >
> > But if you revert kernel-kernel-cpuc-to-mutexes.patch it boots ok?
> >
> > This is looking quite similar to another hang we're seeing on Power4 iSeries
> > on mainline git:
> > http://ozlabs.org/pipermail/linuxppc64-dev/2006-January/007679.html
>
> Correct...I die in exactly the same place every time with
> DEBUG_MUTEXES=Y. I posted a backtrace that points into the _lock_cpu
> code, but I haven't really dug into the issue yet. I believe this is
> very timing related (Serge was dying slightly differently).

so my question still is: _without_ the workaround patch, i.e. with
vanilla -mm4, and DEBUG_MUTEXES=n, do you get a hang?

the reason for my question is that DEBUG_MUTEXES=y will e.g. enable
interrupts - so buggy early bootup code which relies on interrupts being
off might be surprised by it. The fact that you observed that it's
somehow related to the timer interrupt seems to strengthen this
suspicion. DEBUG_MUTEXES=n on the other hand should have no such
interrupt-enabling effects.

[ if this indeed is the case then i'll add irqs_off() checks to
DEBUG_MUTEXES=y, to ensure that the mutex APIs are never called with
interrupts disabled. ]

Ingo

2006-01-18 06:41:01

by Nathan Lynch

Subject: Re: 2.6.15-mm4 failure on power5

Dave C Boutcher wrote:
> On Tue, Jan 17, 2006 at 10:52:44AM -0600, Dave C Boutcher wrote:
> > Well, it turns out that I've been running with CONFIG_DEBUG_MUTEXES all
> > along...so no noise. My console output is a little different than
> > Serge's, so I think this is timing related. Also note that I'm dying in
> > the timer interrupt...
>
> duh... here's the backtrace
> 0:mon> t
> [c000000000577890] c0000000000034b4 decrementer_common+0xb4/0x100
> --- Exception: 901 (Decrementer) at c0000000004627ec
> .__mutex_lock_interruptible_slowpath+0x3bc/0x4c4
> [c000000000577c60] c000000000075064 .__lock_cpu_hotplug+0x44/0xa8
> [c000000000577ce0] c000000000075600 .register_cpu_notifier+0x24/0x68
> [c000000000577d70] c00000000052cd7c .do_init_bootmem+0x68c/0xab0
> [c000000000577e50] c000000000522c84 .setup_arch+0x21c/0x2c0
> [c000000000577ef0] c00000000051a538 .start_kernel+0x40/0x280
> [c000000000577f90] c000000000008574 .hmt_init+0x0/0x8c

The mutex debug code (debug_spin_unlock in kernel/mutex-debug.h) is
doing a local_irq_enable way before we're ready.

BTW: I couldn't build powerpc without mutex debugging until I changed
the SYNC_ON_SMP in include/asm-powerpc/mutex.h:__mutex_fastpath_unlock
to ISYNC_ON_SMP.

With that change, I was able to boot semi-successfully with mutex
debugging off -- the system got hung up when udev started, apparently
(or maybe I was too impatient).

2006-01-18 06:54:05

by Andrew Morton

Subject: Re: 2.6.15-mm4 failure on power5

Ingo Molnar <[email protected]> wrote:
>
>
> * Dave C Boutcher <[email protected]> wrote:
>
> > On Wed, Jan 18, 2006 at 11:19:36AM +1100, Michael Ellerman wrote:
> > > It booted fine _with_ the patch applied, with DEBUG_MUTEXES=y and n.
> > >
> > > Boutcher, to be clear, you can't boot with kernel-kernel-cpuc-to-mutexes.patch
> > > applied and DEBUG_MUTEXES=y ?
> > >
> > > But if you revert kernel-kernel-cpuc-to-mutexes.patch it boots ok?
> > >
> > > This is looking quite similar to another hang we're seeing on Power4 iSeries
> > > on mainline git:
> > > http://ozlabs.org/pipermail/linuxppc64-dev/2006-January/007679.html
> >
> > Correct...I die in exactly the same place every time with
> > DEBUG_MUTEXES=Y. I posted a backtrace that points into the _lock_cpu
> > code, but I haven't really dug into the issue yet. I believe this is
> > very timing related (Serge was dying slightly differently).
>
> so my question still is: _without_ the workaround patch, i.e. with
> vanilla -mm4, and DEBUG_MUTEXES=n, do you get a hang?
>
> the reason for my question is that DEBUG_MUTEXES=y will e.g. enable
> interrupts

That used to kill ppc64 and yes, it died in timer interrupts.

> - so buggy early bootup code which relies on interrupts being
> off might be surprised by it.

I don't think it's necessarily buggy that bootup code needs interrupts
disabled. It _is_ buggy that bootup code which needs interrupts disabled
is calling lock_cpu_hotplug().

> The fact that you observed that it's
> somehow related to the timer interrupt seems to strengthen this
> suspicion. DEBUG_MUTEXES=n on the other hand should have no such
> interrupt-enabling effects.
>
> [ if this indeed is the case then i'll add irqs_off() checks to
> DEBUG_MUTEXES=y, to ensure that the mutex APIs are never called with
> interrupts disabled. ]

Yes, I suppose so. But we're already calling might_sleep(), and
might_sleep() checks for that. Perhaps the might_sleep() check is being
defeated by the nasty system_running check.

There's a sad story behind that system_running check in might_sleep().
Because the kernel early boot is running in an in_atomic() state, a great
number of bogus might_sleep() warnings come out because of various code
doing potentially-sleepy things. I ended up adding the system_running
test, with the changelog "OK, I give up. Kill all the might_sleep warnings
from the early boot process." Undoing that and fixing up the fallout would
be a lot of nasty work.
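
A minimal model of the gating described above (a sketch under assumptions, not the kernel's actual might_sleep() source): the unsafe-context test is only reported once the system state says "running", so early-boot callers never see a warning. The enum, the function name and its parameters are invented for this illustration.

```c
#include <stdbool.h>

/* Invented stand-in for the kernel's system state variable. */
enum system_state_model { STATE_BOOTING, STATE_RUNNING };

/*
 * Returns true when the modelled check would print a warning.
 * A context is unsafe for sleeping if it is atomic or has
 * interrupts disabled, but the report is suppressed during boot.
 */
static bool might_sleep_would_warn(enum system_state_model state,
                                   bool irqs_disabled, bool in_atomic)
{
    bool unsafe = in_atomic || irqs_disabled;

    return unsafe && state == STATE_RUNNING;
}
```

Calling might_sleep_would_warn(STATE_BOOTING, true, true) returns false, which matches the silence seen here: the early boot path takes a sleeping lock with interrupts off and nothing complains. The same arguments with STATE_RUNNING return true.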


2006-01-18 07:04:30

by Ingo Molnar

Subject: Re: 2.6.15-mm4 failure on power5


* Andrew Morton <[email protected]> wrote:

> > [ if this indeed is the case then i'll add irqs_off() checks to
> > DEBUG_MUTEXES=y, to ensure that the mutex APIs are never called with
> > interrupts disabled. ]
>
> Yes, I suppose so. But we're already calling might_sleep(), and
> might_sleep() checks for that. Perhaps the might_sleep() check is
> being defeated by the nasty system_running check.

ah ... indeed.

> There's a sad story behind that system_running check in might_sleep().
> Because the kernel early boot is running in an in_atomic() state, a
> great number of bogus might_sleep() warnings come out because of
> various code doing potentially-sleepy things. I ended up adding the
> system_running test, with the changelog "OK, I give up. Kill all the
> might_sleep warnings from the early boot process." Undoing that and
> fixing up the fallout would be a lot of nasty work.

OTOH, x86 was just fine last i checked, and it has a lot more complex
bootup code than any of the other architectures (due to the sheer number
of x86 variants).

Ingo

2006-01-18 07:07:31

by Ingo Molnar

Subject: Re: 2.6.15-mm4 failure on power5


* Nathan Lynch <[email protected]> wrote:

> Dave C Boutcher wrote:
> > On Tue, Jan 17, 2006 at 10:52:44AM -0600, Dave C Boutcher wrote:
> > > Well, it turns out that I've been running with CONFIG_DEBUG_MUTEXES all
> > > along...so no noise. My console output is a little different than
> > > Serge's, so I think this is timing related. Also note that I'm dying in
> > > the timer interrupt...
> >
> > duh... here's the backtrace
> > 0:mon> t
> > [c000000000577890] c0000000000034b4 decrementer_common+0xb4/0x100
> > --- Exception: 901 (Decrementer) at c0000000004627ec
> > .__mutex_lock_interruptible_slowpath+0x3bc/0x4c4
> > [c000000000577c60] c000000000075064 .__lock_cpu_hotplug+0x44/0xa8
> > [c000000000577ce0] c000000000075600 .register_cpu_notifier+0x24/0x68
> > [c000000000577d70] c00000000052cd7c .do_init_bootmem+0x68c/0xab0
> > [c000000000577e50] c000000000522c84 .setup_arch+0x21c/0x2c0
> > [c000000000577ef0] c00000000051a538 .start_kernel+0x40/0x280
> > [c000000000577f90] c000000000008574 .hmt_init+0x0/0x8c
>
> The mutex debug code (debug_spin_unlock in kernel/mutex-debug.h) is
> doing a local_irq_enable way before we're ready.
>
> BTW: I couldn't build powerpc without mutex debugging until I changed
> the SYNC_ON_SMP in include/asm-powerpc/mutex.h:__mutex_fastpath_unlock
> to ISYNC_ON_SMP.
>
> With that change, I was able to boot semi-successfully with mutex
> debugging off -- the system got hung up when udev started, apparently
> (or maybe I was too impatient).

ugh! Does the patch below get you a working system with DEBUG_MUTEXES=n?

Ingo

--

revert the ppc64 mutex fastpath assembly optimizations for now.

Signed-off-by: Ingo Molnar <[email protected]>

----
include/asm-powerpc/mutex.h | 85 ++------------------------------------------
1 files changed, 5 insertions(+), 80 deletions(-)

Index: linux/include/asm-powerpc/mutex.h
===================================================================
--- linux.orig/include/asm-powerpc/mutex.h
+++ linux/include/asm-powerpc/mutex.h
@@ -1,84 +1,9 @@
/*
- * include/asm-powerpc/mutex.h
+ * Pull in the generic implementation for the mutex fastpath.
*
- * PowerPC optimized mutex locking primitives
- *
- * Please look into asm-generic/mutex-xchg.h for a formal definition.
- * Copyright (C) 2006 Joel Schopp <[email protected]>, IBM
+ * TODO: implement optimized primitives instead, or leave the generic
+ * implementation in place, or pick the atomic_xchg() based generic
+ * implementation. (see asm-generic/mutex-xchg.h for details)
*/
-#ifndef _ASM_MUTEX_H
-#define _ASM_MUTEX_H
-#define __mutex_fastpath_lock(count, fail_fn)\
-do{ \
- int tmp; \
- __asm__ __volatile__( \
-"1: lwarx %0,0,%1\n" \
-" addic %0,%0,-1\n" \
-" stwcx. %0,0,%1\n" \
-" bne- 1b\n" \
- ISYNC_ON_SMP \
- : "=&r" (tmp) \
- : "r" (&(count)->counter) \
- : "cr0", "memory"); \
- if (unlikely(tmp < 0)) \
- fail_fn(count); \
-} while (0)
-
-#define __mutex_fastpath_unlock(count, fail_fn)\
-do{ \
- int tmp; \
- __asm__ __volatile__(SYNC_ON_SMP\
-"1: lwarx %0,0,%1\n" \
-" addic %0,%0,1\n" \
-" stwcx. %0,0,%1\n" \
-" bne- 1b\n" \
- : "=&r" (tmp) \
- : "r" (&(count)->counter) \
- : "cr0", "memory"); \
- if (unlikely(tmp <= 0)) \
- fail_fn(count); \
-} while (0)
-
-
-static inline int
-__mutex_fastpath_trylock(atomic_t* count, int (*fail_fn)(atomic_t*))
-{
- int tmp;
- __asm__ __volatile__(
-"1: lwarx %0,0,%1\n"
-" cmpwi 0,%0,1\n"
-" bne- 2f\n"
-" addic %0,%0,-1\n"
-" stwcx. %0,0,%1\n"
-" bne- 1b\n"
-" isync\n"
-"2:"
- : "=&r" (tmp)
- : "r" (&(count)->counter)
- : "cr0", "memory");
-
- return (int)tmp;
-
-}
-
-#define __mutex_slowpath_needs_to_unlock() 1

-static inline int
-__mutex_fastpath_lock_retval(atomic_t* count, int (*fail_fn)(atomic_t *))
-{
- int tmp;
- __asm__ __volatile__(
-"1: lwarx %0,0,%1\n"
-" addic %0,%0,-1\n"
-" stwcx. %0,0,%1\n"
-" bne- 1b\n"
-" isync \n"
- : "=&r" (tmp)
- : "r" (&(count)->counter)
- : "cr0", "memory");
- if (unlikely(tmp < 0))
- return fail_fn(count);
- else
- return 0;
-}
-#endif
+#include <asm-generic/mutex-dec.h>
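
For readers unfamiliar with the decrement-based fastpath that the revert falls back to, here is a rough user-space model of its counting logic. It is a sketch under assumptions: C11 atomics stand in for the kernel's atomic_t helpers, the real header also takes care of memory barriers, and every name below (fastpath_lock_model, fastpath_unlock_model, count_slowpath, demo_contended) is invented for this example.

```c
#include <stdatomic.h>

/* count: 1 = unlocked, 0 = locked, negative = locked with waiters. */
typedef void (*fail_fn_t)(atomic_int *count);

static int slowpath_calls;

/* Stand-in slowpath: a real one would queue and sleep, this only counts. */
static void count_slowpath(atomic_int *count)
{
    (void)count;
    slowpath_calls++;
}

static void fastpath_lock_model(atomic_int *count, fail_fn_t fail_fn)
{
    /* atomic_fetch_sub returns the old value, so new value = old - 1. */
    if (atomic_fetch_sub(count, 1) - 1 < 0)
        fail_fn(count);
}

static void fastpath_unlock_model(atomic_int *count, fail_fn_t fail_fn)
{
    /* atomic_fetch_add returns the old value, so new value = old + 1. */
    if (atomic_fetch_add(count, 1) + 1 <= 0)
        fail_fn(count);
}

/* Two lock attempts on one mutex, then two unlocks. */
static int demo_contended(void)
{
    atomic_int count;

    atomic_init(&count, 1);
    slowpath_calls = 0;
    fastpath_lock_model(&count, count_slowpath);   /* 1 -> 0, fast     */
    fastpath_lock_model(&count, count_slowpath);   /* 0 -> -1, slow    */
    fastpath_unlock_model(&count, count_slowpath); /* -1 -> 0, slow    */
    fastpath_unlock_model(&count, count_slowpath); /* 0 -> 1, fast     */
    return slowpath_calls;
}
```

In this model an uncontended lock and unlock never reach the slowpath function, which is the property the later discussion relies on when it says an early-boot caller cannot hit contention.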

2006-01-18 07:28:32

by Nathan Lynch

Subject: Re: 2.6.15-mm4 failure on power5

Andrew Morton wrote:
> Ingo Molnar <[email protected]> wrote:
> > - so buggy early bootup code which relies on interrupts being
> > off might be surprised by it.
>
> I don't think it's necessarily buggy that bootup code needs interrupts
> disabled. It _is_ buggy that bootup code which needs interrupts disabled
> is calling lock_cpu_hotplug().

I guess I don't understand -- why is it wrong for code that runs only
in early early bootup, when there is only one process context, to use
common code to e.g. register a hotplug cpu notifier? Should the
powerpc numa code be made to wait to register its notifier until
initcall time or something?

> > The fact that you observed that it's
> > somehow related to the timer interrupt seems to strengthen this
> > suspicion. DEBUG_MUTEXES=n on the other hand should have no such
> > interrupt-enabling effects.
> >
> > [ if this indeed is the case then i'll add irqs_off() checks to
> > DEBUG_MUTEXES=y, to ensure that the mutex APIs are never called with
> > interrupts disabled. ]
>
> Yes, I suppose so. But we're already calling might_sleep(), and
> might_sleep() checks for that. Perhaps the might_sleep() check is being
> defeated by the nasty system_running check.

Yes, which would be why this code never triggered a warning when
cpucontrol was a semaphore.

2006-01-18 07:38:15

by Arjan van de Ven

Subject: Re: 2.6.15-mm4 failure on power5

On Wed, 2006-01-18 at 01:28 -0600, Nathan Lynch wrote:
> Andrew Morton wrote:
> > Ingo Molnar <[email protected]> wrote:
> > > - so buggy early bootup code which relies on interrupts being
> > > off might be surprised by it.
> >
> > I don't think it's necessarily buggy that bootup code needs interrupts
> > disabled. It _is_ buggy that bootup code which needs interrupts disabled
> > is calling lock_cpu_hotplug().
>
> I guess I don't understand -- why is it wrong for code that runs only
> in early early bootup, when there is only one process context, to use
> common code to e.g. register a hotplug cpu notifier?

it's nasty to use things-that-can-sleep there though.
Even if that sleep is a bit theoretical, it still isn't nice.


2006-01-18 07:38:24

by Andrew Morton

Subject: Re: 2.6.15-mm4 failure on power5

Nathan Lynch <[email protected]> wrote:
>
> Andrew Morton wrote:
> > Ingo Molnar <[email protected]> wrote:
> > > - so buggy early bootup code which relies on interrupts being
> > > off might be surprised by it.
> >
> > I don't think it's necessarily buggy that bootup code needs interrupts
> > disabled. It _is_ buggy that bootup code which needs interrupts disabled
> > is calling lock_cpu_hotplug().
>
> I guess I don't understand -- why is it wrong for code that runs only
> in early early bootup, when there is only one process context, to use
> common code to e.g. register a hotplug cpu notifier?

OK, it's not wrong I guess - we're running code which requires
local_irq_disable() and that code is calling functions which do
local_irq_enable() but we know that those functions won't do that because
there cannot be any lock contention.

So it works, and will continue to work, but it's all rather unpleasant, IMO.

> Should the
> powerpc numa code be made to wait to register its notifier until
> initcall time or something?

I think the powerpc code is busted, really - it shouldn't be keeling over
like that if someone enables local interrupts. That being said, it's a
good way of detecting accidental interrupt-enablings.

> Yes, which would be why this code never triggered a warning when
> cpucontrol was a semaphore.

Yup. Perhaps a sane fix which preserves the unpleasant semantics is to do
irqsave in the mutex debug code.

2006-01-18 07:53:29

by Nathan Lynch

Subject: Re: 2.6.15-mm4 failure on power5

Ingo Molnar wrote:
>
> * Nathan Lynch <[email protected]> wrote:
>
> > Dave C Boutcher wrote:
> > > On Tue, Jan 17, 2006 at 10:52:44AM -0600, Dave C Boutcher wrote:
> > > > Well, it turns out that I've been running with CONFIG_DEBUG_MUTEXES all
> > > > along...so no noise. My console output is a little different than
> > > > Serge's, so I think this is timing related. Also note that I'm dying in
> > > > the timer interrupt...
> > >
> > > duh... here's the backtrace
> > > 0:mon> t
> > > [c000000000577890] c0000000000034b4 decrementer_common+0xb4/0x100
> > > --- Exception: 901 (Decrementer) at c0000000004627ec
> > > .__mutex_lock_interruptible_slowpath+0x3bc/0x4c4
> > > [c000000000577c60] c000000000075064 .__lock_cpu_hotplug+0x44/0xa8
> > > [c000000000577ce0] c000000000075600 .register_cpu_notifier+0x24/0x68
> > > [c000000000577d70] c00000000052cd7c .do_init_bootmem+0x68c/0xab0
> > > [c000000000577e50] c000000000522c84 .setup_arch+0x21c/0x2c0
> > > [c000000000577ef0] c00000000051a538 .start_kernel+0x40/0x280
> > > [c000000000577f90] c000000000008574 .hmt_init+0x0/0x8c
> >
> > The mutex debug code (debug_spin_unlock in kernel/mutex-debug.h) is
> > doing a local_irq_enable way before we're ready.
> >
> > BTW: I couldn't build powerpc without mutex debugging until I changed
> > the SYNC_ON_SMP in include/asm-powerpc/mutex.h:__mutex_fastpath_unlock
> > to ISYNC_ON_SMP.
> >
> > With that change, I was able to boot semi-successfully with mutex
> > debugging off -- the system got hung up when udev started, apparently
> > (or maybe I was too impatient).
>
> ugh! Does the patch below get you a working system with DEBUG_MUTEXES=n?


Yes, this gets me to a login prompt, thanks.

2006-01-18 08:08:23

by Nathan Lynch

Subject: Re: 2.6.15-mm4 failure on power5

Nathan Lynch wrote:
> Dave C Boutcher wrote:
> > On Tue, Jan 17, 2006 at 10:52:44AM -0600, Dave C Boutcher wrote:
> > > Well, it turns out that I've been running with CONFIG_DEBUG_MUTEXES all
> > > along...so no noise. My console output is a little different than
> > > Serge's, so I think this is timing related. Also note that I'm dying in
> > > the timer interrupt...
> >
> > duh... here's the backtrace
> > 0:mon> t
> > [c000000000577890] c0000000000034b4 decrementer_common+0xb4/0x100
> > --- Exception: 901 (Decrementer) at c0000000004627ec
> > .__mutex_lock_interruptible_slowpath+0x3bc/0x4c4
> > [c000000000577c60] c000000000075064 .__lock_cpu_hotplug+0x44/0xa8
> > [c000000000577ce0] c000000000075600 .register_cpu_notifier+0x24/0x68
> > [c000000000577d70] c00000000052cd7c .do_init_bootmem+0x68c/0xab0
> > [c000000000577e50] c000000000522c84 .setup_arch+0x21c/0x2c0
> > [c000000000577ef0] c00000000051a538 .start_kernel+0x40/0x280
> > [c000000000577f90] c000000000008574 .hmt_init+0x0/0x8c
>
> The mutex debug code (debug_spin_unlock in kernel/mutex-debug.h) is
> doing a local_irq_enable way before we're ready.

Looks like not only the powerpc setup_arch code could trigger this --
rcu_init, init_timers, and sched_init all do register_cpu_notifier
(and hence mutex_lock, therefore potentially enabling interrupts too
early in the mutex debug case) before the initial local_irq_enable in
start_kernel.

2006-01-18 08:08:27

by Ingo Molnar

Subject: Re: 2.6.15-mm4 failure on power5


* Andrew Morton <[email protected]> wrote:

> > Yes, which would be why this code never triggered a warning when
> > cpucontrol was a semaphore.
>
> Yup. Perhaps a sane fix which preserves the unpleasant semantics is
> to do irqsave in the mutex debug code.

i'd much rather remove that ugly hack from __might_sleep(). How many
other bugs does it hide? Does it hide bugs that dont normally trigger
during bootups on real hardware, but which could trigger on e.g. UML or
on Xen? I really think such ugly workarounds are not justified, if other
arches can get their act together. Would you make such an exception for
other arches too, like ARM?

an irqsave in the mutex debug code will uglify the kernel/mutex.c code -
i'd have to add extra "unsigned long flags" lines. [It will also slow
down the debug code a bit - an extra PUSHF has to be done.]

Ingo

2006-01-18 08:25:37

by Andrew Morton

Subject: Re: 2.6.15-mm4 failure on power5

Ingo Molnar <[email protected]> wrote:
>
>
> * Andrew Morton <[email protected]> wrote:
>
> > > Yes, which would be why this code never triggered a warning when
> > > cpucontrol was a semaphore.
> >
> > Yup. Perhaps a sane fix which preserves the unpleasant semantics is
> > to do irqsave in the mutex debug code.
>
> i'd much rather remove that ugly hack from __might_sleep(). How many
> other bugs does it hide?

Gee, it was 2.6.0-test9. I don't remember, but I do recall the problems
were really really nasty, and what's the point? We're only running one
thread on one CPU at that time, so none of these things _will_ sleep.

> Does it hide bugs that dont normally trigger
> during bootups on real hardware, but which could trigger on e.g. UML or
> on Xen? I really think such ugly workarounds are not justified, if other
> arches can get their act together. Would you make such an exception for
> other arches too, like ARM?

Don't care really, as long as a) the problems don't hit -mm or mainline and
b) someone else fixes them. Yes, it'd be nice to fix these things, and we
might even find real bugs. Perhaps things are better now, but I suspect
it's a can of worms.

> an irqsave in the mutex debug code will uglify the kernel/mutex.c code -
> i'd have to add extra "unsigned long flags" lines. [It will also slow
> down the debug code a bit - an extra PUSHF has to be done.]

Small cost, really...

2006-01-18 09:02:50

by Ingo Molnar

Subject: [patch] work around ppc64 bootup bug by making mutex-debugging save/restore irqs


* Andrew Morton <[email protected]> wrote:

> > Does it hide bugs that dont normally trigger
> > during bootups on real hardware, but which could trigger on e.g. UML or
> > on Xen? I really think such ugly workarounds are not justified, if other
> > arches can get their act together. Would you make such an exception for
> > other arches too, like ARM?
>
> Don't care really, as long as a) the problems don't hit -mm or
> mainline and b) someone else fixes them. Yes, it'd be nice to fix
> these things, and we might even find real bugs. Perhaps things are
> better now, but I suspect it's a can of worms.

patch to mutex code below.

--

it seems ppc64 wants to lock mutexes in early bootup code, with
interrupts disabled, and they expect interrupts to stay disabled, else
they crash.

work this bug around by making mutex debugging variants save/restore irq
flags.

Signed-off-by: Ingo Molnar <[email protected]>

----

kernel/mutex-debug.c | 12 ++++++------
kernel/mutex-debug.h | 25 +++++--------------------
kernel/mutex.c | 21 ++++++++++++---------
kernel/mutex.h | 6 ++++--
4 files changed, 27 insertions(+), 37 deletions(-)

Index: linux/kernel/mutex-debug.c
===================================================================
--- linux.orig/kernel/mutex-debug.c
+++ linux/kernel/mutex-debug.c
@@ -153,13 +153,13 @@ next:
continue;
count++;
cursor = curr->next;
- debug_spin_lock_restore(&debug_mutex_lock, flags);
+ debug_spin_unlock_restore(&debug_mutex_lock, flags);

printk("\n#%03d: ", count);
printk_lock(lock, filter ? 0 : 1);
goto next;
}
- debug_spin_lock_restore(&debug_mutex_lock, flags);
+ debug_spin_unlock_restore(&debug_mutex_lock, flags);
printk("\n");
}

@@ -316,7 +316,7 @@ void mutex_debug_check_no_locks_held(str
continue;
list_del_init(curr);
DEBUG_OFF();
- debug_spin_lock_restore(&debug_mutex_lock, flags);
+ debug_spin_unlock_restore(&debug_mutex_lock, flags);

printk("BUG: %s/%d, lock held at task exit time!\n",
task->comm, task->pid);
@@ -325,7 +325,7 @@ void mutex_debug_check_no_locks_held(str
printk("exiting task is not even the owner??\n");
return;
}
- debug_spin_lock_restore(&debug_mutex_lock, flags);
+ debug_spin_unlock_restore(&debug_mutex_lock, flags);
}

/*
@@ -352,7 +352,7 @@ void mutex_debug_check_no_locks_freed(co
continue;
list_del_init(curr);
DEBUG_OFF();
- debug_spin_lock_restore(&debug_mutex_lock, flags);
+ debug_spin_unlock_restore(&debug_mutex_lock, flags);

printk("BUG: %s/%d, active lock [%p(%p-%p)] freed!\n",
current->comm, current->pid, lock, from, to);
@@ -362,7 +362,7 @@ void mutex_debug_check_no_locks_freed(co
printk("freeing task is not even the owner??\n");
return;
}
- debug_spin_lock_restore(&debug_mutex_lock, flags);
+ debug_spin_unlock_restore(&debug_mutex_lock, flags);
}

/*
Index: linux/kernel/mutex-debug.h
===================================================================
--- linux.orig/kernel/mutex-debug.h
+++ linux/kernel/mutex-debug.h
@@ -46,21 +46,6 @@ extern void mutex_remove_waiter(struct m
extern void debug_mutex_unlock(struct mutex *lock);
extern void debug_mutex_init(struct mutex *lock, const char *name);

-#define debug_spin_lock(lock) \
- do { \
- local_irq_disable(); \
- if (debug_mutex_on) \
- spin_lock(lock); \
- } while (0)
-
-#define debug_spin_unlock(lock) \
- do { \
- if (debug_mutex_on) \
- spin_unlock(lock); \
- local_irq_enable(); \
- preempt_check_resched(); \
- } while (0)
-
#define debug_spin_lock_save(lock, flags) \
do { \
local_irq_save(flags); \
@@ -68,7 +53,7 @@ extern void debug_mutex_init(struct mute
spin_lock(lock); \
} while (0)

-#define debug_spin_lock_restore(lock, flags) \
+#define debug_spin_unlock_restore(lock, flags) \
do { \
if (debug_mutex_on) \
spin_unlock(lock); \
@@ -76,20 +61,20 @@ extern void debug_mutex_init(struct mute
preempt_check_resched(); \
} while (0)

-#define spin_lock_mutex(lock) \
+#define spin_lock_mutex(lock, flags) \
do { \
struct mutex *l = container_of(lock, struct mutex, wait_lock); \
\
DEBUG_WARN_ON(in_interrupt()); \
- debug_spin_lock(&debug_mutex_lock); \
+ debug_spin_lock_save(&debug_mutex_lock, flags); \
spin_lock(lock); \
DEBUG_WARN_ON(l->magic != l); \
} while (0)

-#define spin_unlock_mutex(lock) \
+#define spin_unlock_mutex(lock, flags) \
do { \
spin_unlock(lock); \
- debug_spin_unlock(&debug_mutex_lock); \
+ debug_spin_unlock_restore(&debug_mutex_lock, flags); \
} while (0)

#define DEBUG_OFF() \
Index: linux/kernel/mutex.c
===================================================================
--- linux.orig/kernel/mutex.c
+++ linux/kernel/mutex.c
@@ -125,10 +125,11 @@ __mutex_lock_common(struct mutex *lock,
struct task_struct *task = current;
struct mutex_waiter waiter;
unsigned int old_val;
+ unsigned long flags;

debug_mutex_init_waiter(&waiter);

- spin_lock_mutex(&lock->wait_lock);
+ spin_lock_mutex(&lock->wait_lock, flags);

debug_mutex_add_waiter(lock, &waiter, task->thread_info, ip);

@@ -157,7 +158,7 @@ __mutex_lock_common(struct mutex *lock,
if (unlikely(state == TASK_INTERRUPTIBLE &&
signal_pending(task))) {
mutex_remove_waiter(lock, &waiter, task->thread_info);
- spin_unlock_mutex(&lock->wait_lock);
+ spin_unlock_mutex(&lock->wait_lock, flags);

debug_mutex_free_waiter(&waiter);
return -EINTR;
@@ -165,9 +166,9 @@ __mutex_lock_common(struct mutex *lock,
__set_task_state(task, state);

/* didnt get the lock, go to sleep: */
- spin_unlock_mutex(&lock->wait_lock);
+ spin_unlock_mutex(&lock->wait_lock, flags);
schedule();
- spin_lock_mutex(&lock->wait_lock);
+ spin_lock_mutex(&lock->wait_lock, flags);
}

/* got the lock - rejoice! */
@@ -178,7 +179,7 @@ __mutex_lock_common(struct mutex *lock,
if (likely(list_empty(&lock->wait_list)))
atomic_set(&lock->count, 0);

- spin_unlock_mutex(&lock->wait_lock);
+ spin_unlock_mutex(&lock->wait_lock, flags);

debug_mutex_free_waiter(&waiter);

@@ -203,10 +204,11 @@ static fastcall noinline void
__mutex_unlock_slowpath(atomic_t *lock_count __IP_DECL__)
{
struct mutex *lock = container_of(lock_count, struct mutex, count);
+ unsigned long flags;

DEBUG_WARN_ON(lock->owner != current_thread_info());

- spin_lock_mutex(&lock->wait_lock);
+ spin_lock_mutex(&lock->wait_lock, flags);

/*
* some architectures leave the lock unlocked in the fastpath failure
@@ -231,7 +233,7 @@ __mutex_unlock_slowpath(atomic_t *lock_c

debug_mutex_clear_owner(lock);

- spin_unlock_mutex(&lock->wait_lock);
+ spin_unlock_mutex(&lock->wait_lock, flags);
}

/*
@@ -276,9 +278,10 @@ __mutex_lock_interruptible_slowpath(atom
static inline int __mutex_trylock_slowpath(atomic_t *lock_count)
{
struct mutex *lock = container_of(lock_count, struct mutex, count);
+ unsigned long flags;
int prev;

- spin_lock_mutex(&lock->wait_lock);
+ spin_lock_mutex(&lock->wait_lock, flags);

prev = atomic_xchg(&lock->count, -1);
if (likely(prev == 1))
@@ -287,7 +290,7 @@ static inline int __mutex_trylock_slowpa
if (likely(list_empty(&lock->wait_list)))
atomic_set(&lock->count, 0);

- spin_unlock_mutex(&lock->wait_lock);
+ spin_unlock_mutex(&lock->wait_lock, flags);

return prev == 1;
}
Index: linux/kernel/mutex.h
===================================================================
--- linux.orig/kernel/mutex.h
+++ linux/kernel/mutex.h
@@ -9,8 +9,10 @@
* !CONFIG_DEBUG_MUTEXES case. Most of them are NOPs:
*/

-#define spin_lock_mutex(lock) spin_lock(lock)
-#define spin_unlock_mutex(lock) spin_unlock(lock)
+#define spin_lock_mutex(lock, flags) \
+ do { spin_lock(lock); (void)(flags); } while (0)
+#define spin_unlock_mutex(lock, flags) \
+ do { spin_unlock(lock); (void)(flags); } while (0)
#define mutex_remove_waiter(lock, waiter, ti) \
__list_del((waiter)->list.prev, (waiter)->list.next)
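
The effect of the patch above can be modelled in userspace. The sketch below is only an illustration, not kernel code: a plain boolean stands in for the CPU interrupt-enable bit, and every `model_*` name is hypothetical. It contrasts the old debug helpers, which re-enable interrupts unconditionally on unlock, with the save/restore variant, which hands the caller back whatever state it entered with.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the CPU interrupt-enable bit (hypothetical model). */
static bool model_irqs_on;

static void model_irq_disable(void) { model_irqs_on = false; }
static void model_irq_enable(void)  { model_irqs_on = true; }

/* Remember the current state in *flags, then disable. */
static void model_irq_save(unsigned long *flags)
{
    *flags = model_irqs_on ? 1UL : 0UL;
    model_irqs_on = false;
}

/* Put back exactly the state that model_irq_save() recorded. */
static void model_irq_restore(unsigned long flags)
{
    model_irqs_on = (flags != 0);
}

/* Old behaviour: disable on lock, enable unconditionally on unlock. */
static void model_old_lock_unlock(void)
{
    model_irq_disable();
    /* ... critical section ... */
    model_irq_enable();
}

/* Patched behaviour: save on lock, restore on unlock. */
static void model_new_lock_unlock(void)
{
    unsigned long flags;

    model_irq_save(&flags);
    assert(!model_irqs_on);   /* interrupts are off inside the section */
    model_irq_restore(flags);
}

/* Run one lock/unlock pair and report the interrupt state the caller sees. */
static bool model_irqs_after(bool initially_on, void (*lock_unlock)(void))
{
    model_irqs_on = initially_on;
    lock_unlock();
    return model_irqs_on;
}
```

Calling `model_irqs_after(false, model_old_lock_unlock)` models the early-boot caller from the backtrace: it entered with interrupts off and comes back with them on. With `model_new_lock_unlock` the same caller comes back with interrupts still off.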

2006-01-18 09:18:59

by Ingo Molnar

Subject: [patch] turn on might_sleep() in early bootup code too


Could we try the patch below in -mm, to get a feeling of how widespread
the early bootup lock-atomicity assumptions are? I also added a .config
option to add back the workaround, and turned it on for ppc64. (users
can still select this manually on any other arch as well)

i found one such bug on x86: early-printk calls register_console(),
which acquires console_sem. Since we can work such things around
per-lock and per-assumption [see the printk.c changes], i think we
should rather do the workarounds that way, and thus document the hacks
we need - without impacting the ability of such platforms to boot -
while still keeping might_sleep() checks widely enabled. (btw., x86
still booted fine, despite the warning.)

Ingo

--

enable might_sleep() checks even in early bootup code (when system_state
!= SYSTEM_RUNNING). There's also a new config option to turn this off:
CONFIG_DEBUG_SPINLOCK_SLEEP_EARLY_BOOTUP_WORKAROUND
which is turned on for ppc64, while most other architectures leave it off.

the patch also documents and works around an EARLY_PRINTK locking
dependency. [which we might want to get rid of in the future.]

tested on x86.

Signed-off-by: Ingo Molnar <[email protected]>

----

arch/powerpc/Kconfig.debug | 2 ++
kernel/printk.c | 11 ++++++++++-
kernel/sched.c | 7 +++++--
lib/Kconfig.debug | 11 +++++++++++
4 files changed, 28 insertions(+), 3 deletions(-)

Index: linux/arch/powerpc/Kconfig.debug
===================================================================
--- linux.orig/arch/powerpc/Kconfig.debug
+++ linux/arch/powerpc/Kconfig.debug
@@ -2,6 +2,8 @@ menu "Kernel hacking"

source "lib/Kconfig.debug"

+select CONFIG_DEBUG_SPINLOCK_SLEEP_EARLY_BOOTUP_WORKAROUND
+
config DEBUG_STACKOVERFLOW
bool "Check for stack overflows"
depends on DEBUG_KERNEL && PPC64
Index: linux/kernel/printk.c
===================================================================
--- linux.orig/kernel/printk.c
+++ linux/kernel/printk.c
@@ -710,7 +710,16 @@ void acquire_console_sem(void)
{
if (in_interrupt())
BUG();
- down(&console_sem);
+ /*
+ * Early-printk wants to acquire the console_sem in
+ * register_console(). Make a special exception for them by
+ * going via trylock first, which doesnt trigger the
+ * might_sleep() atomicity check.
+ */
+#ifdef CONFIG_EARLY_PRINTK
+ if (down_trylock(&console_sem))
+#endif
+ down(&console_sem);
console_locked = 1;
console_may_schedule = 1;
}
Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -6285,8 +6285,11 @@ void __might_sleep(char *file, int line)
#if defined(in_atomic)
static unsigned long prev_jiffy; /* ratelimiting */

- if ((in_atomic() || irqs_disabled()) &&
- system_state == SYSTEM_RUNNING && !oops_in_progress) {
+#ifdef CONFIG_DEBUG_SPINLOCK_SLEEP_EARLY_BOOTUP_WORKAROUND
+ if (system_state != SYSTEM_RUNNING)
+ return;
+#endif
+ if ((in_atomic() || irqs_disabled()) && !oops_in_progress) {
if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
return;
prev_jiffy = jiffies;
Index: linux/lib/Kconfig.debug
===================================================================
--- linux.orig/lib/Kconfig.debug
+++ linux/lib/Kconfig.debug
@@ -128,6 +128,17 @@ config DEBUG_SPINLOCK_SLEEP
If you say Y here, various routines which may sleep will become very
noisy if they are called with a spinlock held.

+config DEBUG_SPINLOCK_SLEEP_EARLY_BOOTUP_WORKAROUND
+ bool "Work around sleep-inside-spinlock checking in early bootup code"
+ depends on DEBUG_SPINLOCK_SLEEP
+ help
+ If you say Y here, then early bootup code will not check for
+ "do not call potentially sleeping functions in atomic sections"
+ rule, that the DEBUG_SPINLOCK_SLEEP option enforces.
+
+ You want to say N here, unless your system does not boot with
+ "Y" here.
+
config DEBUG_KOBJECT
bool "kobject debugging"
depends on DEBUG_KERNEL
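
To make the change to the `__might_sleep()` condition easier to compare, here is a small userspace sketch of the two predicates. It is an assumption-laden model rather than the kernel function: the struct, the enum and all `model_*` names are invented for illustration, and the jiffies-based rate limiting is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel state that the check reads. */
enum model_system_state { MODEL_SYSTEM_BOOTING, MODEL_SYSTEM_RUNNING };

struct model_ctx {
    bool in_atomic;
    bool irqs_disabled;
    bool oops_in_progress;
    enum model_system_state state;
};

/* Condition before the patch: never warns until SYSTEM_RUNNING. */
static bool model_old_check_warns(const struct model_ctx *c)
{
    return (c->in_atomic || c->irqs_disabled) &&
           c->state == MODEL_SYSTEM_RUNNING && !c->oops_in_progress;
}

/*
 * Condition after the patch: the early-boot exemption applies only
 * when the workaround option is selected.
 */
static bool model_new_check_warns(const struct model_ctx *c, bool workaround)
{
    if (workaround && c->state != MODEL_SYSTEM_RUNNING)
        return false;
    return (c->in_atomic || c->irqs_disabled) && !c->oops_in_progress;
}
```

For a context with interrupts disabled during boot, the old predicate stays silent, the new one warns, and the new one with the workaround enabled stays silent again, which matches the behaviour described for ppc64.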

2006-01-18 10:35:53

by Andrew Morton

Subject: Re: [patch] turn on might_sleep() in early bootup code too

Ingo Molnar <[email protected]> wrote:
>
> enable might_sleep() checks even in early bootup code (when system_state
> != SYSTEM_RUNNING). There's also a new config option to turn this off:
> CONFIG_DEBUG_SPINLOCK_SLEEP_EARLY_BOOTUP_WORKAROUND
> while most other architectures.

I get just the one on ppc64:


Debug: sleeping function called from invalid context at include/asm/semaphore.h:62
in_atomic():1, irqs_disabled():1
Call Trace:
[C0000000004EFD20] [C00000000000F660] .show_stack+0x5c/0x1cc (unreliable)
[C0000000004EFDD0] [C000000000053214] .__might_sleep+0xbc/0xe0
[C0000000004EFE60] [C000000000413D1C] .lock_kernel+0x50/0xb0
[C0000000004EFEF0] [C0000000004AC574] .start_kernel+0x1c/0x278
[C0000000004EFF90] [C0000000000085D4] .hmt_init+0x0/0x2c


Your fault ;)

2006-01-18 10:43:29

by Ingo Molnar

Subject: Re: [patch] turn on might_sleep() in early bootup code too


* Andrew Morton <[email protected]> wrote:

> Ingo Molnar <[email protected]> wrote:
> >
> > enable might_sleep() checks even in early bootup code (when system_state
> > != SYSTEM_RUNNING). There's also a new config option to turn this off:
> > CONFIG_DEBUG_SPINLOCK_SLEEP_EARLY_BOOTUP_WORKAROUND
> > while most other architectures.
>
> I get just the one on ppc64:
>
>
> Debug: sleeping function called from invalid context at include/asm/semaphore.h:62
> in_atomic():1, irqs_disabled():1
> Call Trace:
> [C0000000004EFD20] [C00000000000F660] .show_stack+0x5c/0x1cc (unreliable)
> [C0000000004EFDD0] [C000000000053214] .__might_sleep+0xbc/0xe0
> [C0000000004EFE60] [C000000000413D1C] .lock_kernel+0x50/0xb0
> [C0000000004EFEF0] [C0000000004AC574] .start_kernel+0x1c/0x278
> [C0000000004EFF90] [C0000000000085D4] .hmt_init+0x0/0x2c
>
>
> Your fault ;)

yes :-) I have a really ugly workaround in my tree that is definitely
not worth posting. I think to do this cleanly i'll add trylock_kernel(),
and do this in main.c:

BUG_ON(!trylock_kernel());

but there's another one that is much nastier in terms of scope:

BUG: sleeping function called from invalid context at kernel/mutex.c:256
in_atomic():0, irqs_disabled():1
[<c0103db6>] show_trace+0xd/0xf
[<c0103dcd>] dump_stack+0x15/0x17
[<c011ff4b>] __might_sleep+0x64/0x6c
[<c105b470>] mutex_lock_interruptible+0x15/0x22
[<c013d81f>] __lock_cpu_hotplug+0x26/0x52
[<c013d858>] lock_cpu_hotplug_interruptible+0xd/0xf
[<c013d922>] register_cpu_notifier+0xc/0x2b
[<c1dac88f>] page_alloc_init+0xd/0xf
[<c1d992ee>] start_kernel+0x125/0x376
[<c0100210>] 0xc0100210

this is what is causing the ppc64 problems too i think.

lock_cpu_hotplug() has design problems i think: hotplug-locked sections
are slowly spreading in the kernel, encompassing more and more code :-)
Shouldnt the CPU hotplug lock be a spinlock to begin with?

Ingo

2006-01-18 10:47:08

by Nick Piggin

Subject: Re: [patch] turn on might_sleep() in early bootup code too

Andrew Morton wrote:
> Ingo Molnar <[email protected]> wrote:
>
>> enable might_sleep() checks even in early bootup code (when system_state
>> != SYSTEM_RUNNING). There's also a new config option to turn this off:
>> CONFIG_DEBUG_SPINLOCK_SLEEP_EARLY_BOOTUP_WORKAROUND
>> while most other architectures.
>
>
> I get just the one on ppc64:
>
>
> Debug: sleeping function called from invalid context at include/asm/semaphore.h:62
> in_atomic():1, irqs_disabled():1
> Call Trace:
> [C0000000004EFD20] [C00000000000F660] .show_stack+0x5c/0x1cc (unreliable)
> [C0000000004EFDD0] [C000000000053214] .__might_sleep+0xbc/0xe0
> [C0000000004EFE60] [C000000000413D1C] .lock_kernel+0x50/0xb0
> [C0000000004EFEF0] [C0000000004AC574] .start_kernel+0x1c/0x278
> [C0000000004EFF90] [C0000000000085D4] .hmt_init+0x0/0x2c
>
>
> Your fault ;)

This lock_kernel should never sleep should it? Maybe it could be changed
to lock_kernel_init_locked() or something?

--
SUSE Labs, Novell Inc.

2006-01-18 11:07:53

by Ingo Molnar

Subject: Re: [patch] turn on might_sleep() in early bootup code too


* Nick Piggin <[email protected]> wrote:

> Andrew Morton wrote:
> >Ingo Molnar <[email protected]> wrote:
> >
> >>enable might_sleep() checks even in early bootup code (when system_state
> >>!= SYSTEM_RUNNING). There's also a new config option to turn this off:
> >>CONFIG_DEBUG_SPINLOCK_SLEEP_EARLY_BOOTUP_WORKAROUND
> >>while most other architectures.
> >
> >
> >I get just the one on ppc64:
> >
> >
> >Debug: sleeping function called from invalid context at
> >include/asm/semaphore.h:62
> >in_atomic():1, irqs_disabled():1
> >Call Trace:
> >[C0000000004EFD20] [C00000000000F660] .show_stack+0x5c/0x1cc (unreliable)
> >[C0000000004EFDD0] [C000000000053214] .__might_sleep+0xbc/0xe0
> >[C0000000004EFE60] [C000000000413D1C] .lock_kernel+0x50/0xb0
> >[C0000000004EFEF0] [C0000000004AC574] .start_kernel+0x1c/0x278
> >[C0000000004EFF90] [C0000000000085D4] .hmt_init+0x0/0x2c
> >
> >
> >Your fault ;)
>
> This lock_kernel should never sleep should it? Maybe it could be
> changed to lock_kernel_init_locked() or something?

the way i fixed it in my tree was to add a trylock_kernel(), and to
check for success in init/main.c. See the patch below.

Ingo

--

introduce trylock_kernel(), to be used by the early init code to acquire
the BKL in an atomic way.

Signed-off-by: Ingo Molnar <[email protected]>

----

Index: linux/include/linux/smp_lock.h
===================================================================
--- linux.orig/include/linux/smp_lock.h
+++ linux/include/linux/smp_lock.h
@@ -39,6 +39,7 @@ static inline int reacquire_kernel_lock(
}

extern void __lockfunc lock_kernel(void) __acquires(kernel_lock);
+extern int __lockfunc trylock_kernel(void);
extern void __lockfunc unlock_kernel(void) __releases(kernel_lock);

#else
Index: linux/init/main.c
===================================================================
--- linux.orig/init/main.c
+++ linux/init/main.c
@@ -443,11 +443,14 @@ asmlinkage void __init start_kernel(void
{
char * command_line;
extern struct kernel_param __start___param[], __stop___param[];
-/*
- * Interrupts are still disabled. Do necessary setups, then
- * enable them
- */
- lock_kernel();
+
+ /*
+ * Interrupts are still disabled. Do necessary setups, then
+ * enable them. This is the first time we take the BKL, so
+ * it must succeed:
+ */
+ if (!trylock_kernel())
+ WARN_ON(1);
page_address_init();
printk(KERN_NOTICE);
printk(linux_banner);
@@ -466,6 +469,7 @@ asmlinkage void __init start_kernel(void
* time - but meanwhile we still have a functioning scheduler.
*/
sched_init();
+ mutex_key_hash_init();
/*
* Disable preemption - early bootup scheduling is extremely
* fragile until we cpu_idle() for the first time.
Index: linux/lib/kernel_lock.c
===================================================================
--- linux.orig/lib/kernel_lock.c
+++ linux/lib/kernel_lock.c
@@ -76,6 +76,23 @@ void __lockfunc lock_kernel(void)
task->lock_depth = depth;
}

+int __lockfunc trylock_kernel(void)
+{
+ struct task_struct *task = current;
+ int depth = task->lock_depth + 1;
+
+ if (likely(!depth)) {
+ if (unlikely(down_trylock(&kernel_sem)))
+ return 0;
+ else
+ __acquire(kernel_sem);
+ }
+
+ task->lock_depth = depth;
+ return 1;
+}
+
+
void __lockfunc unlock_kernel(void)
{
struct task_struct *task = current;
@@ -194,6 +211,25 @@ void __lockfunc lock_kernel(void)
current->lock_depth = depth;
}

+int __lockfunc trylock_kernel(void)
+{
+ struct task_struct *task = current;
+ int depth = task->lock_depth + 1;
+
+ if (likely(!depth)) {
+ if (unlikely(!spin_trylock(&kernel_flag)))
+ return 0;
+ else
+ __acquire(kernel_sem);
+ }
+
+ if (likely(!depth) && unlikely(!spin_trylock(&kernel_flag)))
+ return 0;
+
+ task->lock_depth = depth;
+ return 1;
+}
+
void __lockfunc unlock_kernel(void)
{
BUG_ON(current->lock_depth < 0);
@@ -204,5 +240,6 @@ void __lockfunc unlock_kernel(void)
#endif

EXPORT_SYMBOL(lock_kernel);
+/* we do not export trylock_kernel(). BKL code should shrink :-) */
EXPORT_SYMBOL(unlock_kernel);

2006-01-18 11:15:27

by Ingo Molnar

Subject: [patch] make bug messages more consistent

while we are changing debugging code: one problem is that we've got a
hodgepodge of bug messages right now, and i frequently miss e.g.
'Badness' messages from the kernel because they simply do not stick out
visually. 'BUG' is much more apparent and also makes it obvious that
there's a kernel bug here. Here's a patch that makes the messages more
consistent:

--

consolidate all kernel bug printouts to begin with the "BUG: " string.
Makes it easier to find them in large bootup logs.

Signed-off-by: Ingo Molnar <[email protected]>

----

include/asm-generic/bug.h | 4 ++--
kernel/sched.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)

Index: linux/include/asm-generic/bug.h
===================================================================
--- linux.orig/include/asm-generic/bug.h
+++ linux/include/asm-generic/bug.h
@@ -7,7 +7,7 @@
#ifdef CONFIG_BUG
#ifndef HAVE_ARCH_BUG
#define BUG() do { \
- printk("kernel BUG at %s:%d!\n", __FILE__, __LINE__); \
+ printk("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, __FUNCTION__); \
panic("BUG!"); \
} while (0)
#endif
@@ -19,7 +19,7 @@
#ifndef HAVE_ARCH_WARN_ON
#define WARN_ON(condition) do { \
if (unlikely((condition)!=0)) { \
- printk("Badness in %s at %s:%d\n", __FUNCTION__, __FILE__, __LINE__); \
+ printk("BUG: warning at %s:%d/%s()\n", __FILE__, __LINE__, __FUNCTION__); \
dump_stack(); \
} \
} while (0)
Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -2973,7 +2973,7 @@ asmlinkage void __sched schedule(void)
*/
if (likely(!current->exit_state)) {
if (unlikely(in_atomic())) {
- printk(KERN_ERR "scheduling while atomic: "
+ printk(KERN_ERR "BUG: scheduling while atomic: "
"%s/0x%08x/%d\n",
current->comm, preempt_count(), current->pid);
dump_stack();
@@ -6293,7 +6293,7 @@ void __might_sleep(char *file, int line)
if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
return;
prev_jiffy = jiffies;
- printk(KERN_ERR "Debug: sleeping function called from invalid"
+ printk(KERN_ERR "BUG: sleeping function called from invalid"
" context at %s:%d\n", file, line);
printk("in_atomic():%d, irqs_disabled():%d\n",
in_atomic(), irqs_disabled());
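
As a rough illustration of why a common prefix helps when scanning logs, the userspace sketch below formats a warning in the new style and tests a line for the prefix. The helper names are hypothetical and the format string is copied from the WARN_ON() hunk above. Nothing here is kernel API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a warning line the way the patched WARN_ON() prints it. */
static int model_format_warning(char *buf, size_t len, const char *file,
                                int line, const char *func)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "BUG: warning at %s:%d/%s()\n",
                    file, line, func);
}

/* A log scanner only needs a single prefix test once messages are uniform. */
static int model_is_bug_line(const char *s)
{
    return strncmp(s, "BUG: ", 5) == 0;
}
```

A line in the old "Badness in ..." style fails `model_is_bug_line()`, while every message touched by the patch passes it.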

2006-01-18 12:53:22

by Ingo Molnar

Subject: [patch] add trylock_kernel()


* Ingo Molnar <[email protected]> wrote:

> the way i fixed it in my tree was to add a trylock_kernel(), and to
> check for success in init/main.c. See the patch below.

i had a silly bug in the spinlock variant, and some extra unneeded
change from another debug patch - fixed patch is below. Tested on x86,
with and without CONFIG_PREEMPT_BKL.

Ingo

--
introduce trylock_kernel(), to be used by the early init code to acquire
the BKL in an atomic way.

Signed-off-by: Ingo Molnar <[email protected]>

----

include/linux/smp_lock.h | 1 +
init/main.c | 13 ++++++++-----
lib/kernel_lock.c | 34 ++++++++++++++++++++++++++++++++++
3 files changed, 43 insertions(+), 5 deletions(-)

Index: linux/include/linux/smp_lock.h
===================================================================
--- linux.orig/include/linux/smp_lock.h
+++ linux/include/linux/smp_lock.h
@@ -39,6 +39,7 @@ static inline int reacquire_kernel_lock(
}

extern void __lockfunc lock_kernel(void) __acquires(kernel_lock);
+extern int __lockfunc trylock_kernel(void);
extern void __lockfunc unlock_kernel(void) __releases(kernel_lock);

#else
Index: linux/init/main.c
===================================================================
--- linux.orig/init/main.c
+++ linux/init/main.c
@@ -443,11 +443,14 @@ asmlinkage void __init start_kernel(void
{
char * command_line;
extern struct kernel_param __start___param[], __stop___param[];
-/*
- * Interrupts are still disabled. Do necessary setups, then
- * enable them
- */
- lock_kernel();
+
+ /*
+ * Interrupts are still disabled. Do necessary setups, then
+ * enable them. This is the first time we take the BKL, so
+ * it must succeed:
+ */
+ if (!trylock_kernel())
+ WARN_ON(1);
page_address_init();
printk(KERN_NOTICE);
printk(linux_banner);
Index: linux/lib/kernel_lock.c
===================================================================
--- linux.orig/lib/kernel_lock.c
+++ linux/lib/kernel_lock.c
@@ -76,6 +76,23 @@ void __lockfunc lock_kernel(void)
task->lock_depth = depth;
}

+int __lockfunc trylock_kernel(void)
+{
+ struct task_struct *task = current;
+ int depth = task->lock_depth + 1;
+
+ if (likely(!depth)) {
+ if (unlikely(down_trylock(&kernel_sem)))
+ return 0;
+ else
+ __acquire(kernel_sem);
+ }
+
+ task->lock_depth = depth;
+ return 1;
+}
+
+
void __lockfunc unlock_kernel(void)
{
struct task_struct *task = current;
@@ -194,6 +211,22 @@ void __lockfunc lock_kernel(void)
current->lock_depth = depth;
}

+int __lockfunc trylock_kernel(void)
+{
+	struct task_struct *task = current;
+	int depth = task->lock_depth + 1;
+
+	if (likely(!depth)) {
+		if (unlikely(!spin_trylock(&kernel_flag)))
+			return 0;
+		else
+			__acquire(kernel_sem);
+	}
+
+	task->lock_depth = depth;
+	return 1;
+}
+
void __lockfunc unlock_kernel(void)
{
BUG_ON(current->lock_depth < 0);
@@ -204,5 +237,6 @@ void __lockfunc unlock_kernel(void)
#endif

EXPORT_SYMBOL(lock_kernel);
+/* we do not export trylock_kernel(). BKL code should shrink :-) */
EXPORT_SYMBOL(unlock_kernel);
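
A note on the semantics of the patch above: trylock_kernel() only touches the underlying lock on the outermost acquisition (lock_depth going from -1 to 0); nested calls just bump the depth and succeed. Below is a minimal userspace sketch of that pattern. It is not kernel code: the pthread mutex, the thread-local lock_depth and the *_sketch names are stand-ins assumed purely for illustration.

```c
#include <assert.h>
#include <pthread.h>

/*
 * Assumed stand-ins: big_lock plays the role of kernel_sem/kernel_flag,
 * and the thread-local lock_depth plays the role of current->lock_depth
 * (-1 means "not held by this thread").
 */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static _Thread_local int lock_depth = -1;

/* Returns 1 if the lock is held on return (possibly nested), 0 on contention. */
int trylock_sketch(void)
{
	int depth = lock_depth + 1;

	if (depth == 0) {
		/* Outermost acquisition: this is the only path that can fail. */
		if (pthread_mutex_trylock(&big_lock) != 0)
			return 0;
	}

	/* Nested acquisition, or the outermost one just succeeded. */
	lock_depth = depth;
	return 1;
}

void unlock_sketch(void)
{
	/* Mirrors the BUG_ON(current->lock_depth < 0) check in the patch context. */
	assert(lock_depth >= 0);

	/* Only the outermost unlock releases the underlying lock. */
	if (--lock_depth < 0)
		pthread_mutex_unlock(&big_lock);
}
```

In this sketch a failed trylock leaves lock_depth untouched, so the caller can simply retry or give up, which is what lets the start_kernel() hunk wrap the call in a WARN_ON(1) check.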

2006-01-18 14:43:31

by Serge E. Hallyn

Subject: Re: 2.6.15-mm4 failure on power5

Quoting Michael Ellerman ([email protected]):
> On Mon, 16 Jan 2006 18:05, Andrew Morton wrote:
> > "Serge E. Hallyn" <[email protected]> wrote:
> > > On my power5 partition, 2.6.15-mm4 hangs on boot
> >
> > It might be worth reverting the changes to arch/powerpc/mm/hash_utils_64.c,
> > see if that unbreaks it.
> >
> > - base = lmb.memory.region[i].base + KERNELBASE;
> > + base = (unsigned long)__va(lmb.memory.region[i].base);
>
> You can try it, but if that fixes the problem I'll buy a sombrero and then eat
> it.

Sounds unpleasant, but no need - that didn't fix it.

> > The nice comment in page.h:
> >
> > * KERNELBASE is the virtual address of the start of the kernel, it's often
> > * the same as PAGE_OFFSET, but _might not be_.
> > *
> > * The kdump dump kernel is one example where KERNELBASE != PAGE_OFFSET.
> > *
> > * To get a physical address from a virtual one you subtract PAGE_OFFSET,
> > * _not_ KERNELBASE.
> >
> > Tells us that was not an equivalent transformation.
>
> True, not equivalent in all cases, but correct. For non-kdump kernels (which I
> assume this is) KERNELBASE == PAGE_OFFSET, and for a kdump kernel that code
> wants to use PAGE_OFFSET, not KERNELBASE.
>
> Try enabling early debugging (see arch/powerpc/kernel/setup_64.c) and then
> turning on DEBUG in hash_utils_64.c, setup_64.c etc.

That gives me the following output:

boot: quicktest
Please wait, loading kernel...
Elf64 kernel loaded...
OF stdout device is: /vdevice/vty@30000000
Hypertas detected, assuming LPAR !
command line: ro console=hvc0 root=/dev/sda6 smt-enabled=1
memory layout at init:
memory_limit : 0000000000000000 (16 MB aligned)
alloc_bottom : 0000000002223000
alloc_top : 0000000008000000
alloc_top_hi : 0000000088000000
rmo_top : 0000000008000000
ram_top : 0000000088000000
Looking for displays
instantiating rtas at 0x00000000077d7000 ... done
0000000000000000 : boot cpu 0000000000000000
0000000000000002 : starting cpu hw idx 0000000000000002... done
0000000000000004 : starting cpu hw idx 0000000000000004... done
0000000000000006 : starting cpu hw idx 0000000000000006... done
copying OF device tree ...
Building dt strings...
Building dt structure...
Device tree strings 0x0000000002424000 -> 0x0000000002424f36
Device tree struct 0x0000000002425000 -> 0x000000000242c000
Calling quiesce ...
returning from prom_init
-> early_setup()
Probing machine type for platform 101...
Found, Initializing memory management...
-> htab_initialize()
creating mapping for region: c000000000000000 : 88000000
<- htab_initialize()
<- early_setup()
-> setup_system()
-> initialize_cache_info()
<- initialize_cache_info()
Page orders: linear mapping = 24, others = 12
-> smp_release_cpus()
<- smp_release_cpus()
<- setup_system()

So setup_system() at least finishes, though I don't see the
printk's at the bottom of that function.

thanks,
-serge
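
For reference, the address arithmetic from the quoted exchange can be sketched in plain C. The PAGE_OFFSET value is taken from the "creating mapping for region" line in the log above. KDUMP_LOAD_OFFSET and the *_sketch helper names are hypothetical, chosen only to show a case where KERNELBASE != PAGE_OFFSET.

```c
#include <assert.h>
#include <stdint.h>

/* Start of the linear mapping, as printed by htab_initialize() in the log. */
#define PAGE_OFFSET       0xc000000000000000ULL
/* Hypothetical load offset for a kdump-style kernel (illustration only). */
#define KDUMP_LOAD_OFFSET 0x2000000ULL

/* New form in the diff: virtual = physical + PAGE_OFFSET, i.e. __va(). */
uint64_t va_sketch(uint64_t phys)
{
	return phys + PAGE_OFFSET;
}

/* Inverse: physical = virtual - PAGE_OFFSET, never minus KERNELBASE. */
uint64_t pa_sketch(uint64_t virt)
{
	return virt - PAGE_OFFSET;
}

/* Old form in the diff: region base + KERNELBASE. */
uint64_t old_base_sketch(uint64_t region_base, uint64_t kernelbase)
{
	return region_base + kernelbase;
}
```

With kernelbase equal to PAGE_OFFSET the old and new forms give the same address, which matches the point made in the quoted reply that the change is a no-op for a non-kdump kernel. They diverge only when kernelbase carries an extra offset.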

2006-01-19 04:34:49

by Zwane Mwaikambo

Subject: Re: [patch] turn on might_sleep() in early bootup code too

On Wed, 18 Jan 2006, Ingo Molnar wrote:

> lock_cpu_hotplug() has design problems i think: hotplug-locked sections
> are slowly spreading in the kernel, encompassing more and more code :-)
> Shouldnt the CPU hotplug lock be a spinlock to begin with?

The way it's used certainly is bizarre, but a spinlock would be harder to
work with, as a lot of the code protected by it sleeps.