2011-05-31 23:50:45

by Christian Kujau

Subject: 3.0-rc1: powerpc hangs at Kernel virtual memory layout

Hi,

trying to boot 3.0-rc1 on powerpc32 only progresses until:

> Kernel virtual memory layout:
> * 0xfffcf000..0xfffff000 : fixmap

And then the system hangs, does not respond to keyboard (sysrq does not
seem to work on this PowerBook G4). But after a while the system reboots
itself, so I guess the machine panicked but did not print anything on the
screen.

Full messages (picture), config & (working) dmesg:

http://nerdbynature.de/bits/3.0-rc1/

I'm currently trying to bisect this, so far I have:

----------------------
git bisect start
# good: [61c4f2c81c61f73549928dfd9f3e8f26aa36a8cf] Linux 2.6.39
git bisect good 61c4f2c81c61f73549928dfd9f3e8f26aa36a8cf
# bad: [55922c9d1b84b89cb946c777fddccb3247e7df2c] Linux 3.0-rc1
git bisect bad 55922c9d1b84b89cb946c777fddccb3247e7df2c
# bad: [c44dead70a841d90ddc01968012f323c33217c9e] Merge branch 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
git bisect bad c44dead70a841d90ddc01968012f323c33217c9e
# bad: [d93515611bbc70c2fe4db232e5feb448ed8e4cc9] macvlan: fix panic if lowerdev in a bond
git bisect bad d93515611bbc70c2fe4db232e5feb448ed8e4cc9
----------------------

Any ideas?

Thanks,
Christian.
--
BOFH excuse #263:

It's stuck in the Web.


2011-06-01 00:26:00

by Benjamin Herrenschmidt

Subject: Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout

On Tue, 2011-05-31 at 16:50 -0700, Christian Kujau wrote:
> Hi,
>
> trying to boot 3.0-rc1 on powerpc32 only progresses until:
>
> > Kernel virtual memory layout:
> > * 0xfffcf000..0xfffff000 : fixmap
>
> And then the system hangs, does not respond to keyboard (sysrq does not
> seem to work on this PowerBook G4). But after a while the system reboots
> itself, so I guess the machine panicked but did not print anything on the
> screen.
>
> Full messages (picture), config & (working) dmesg:
>
> http://nerdbynature.de/bits/3.0-rc1/
>
> I'm currently trying to bisect this, so far I have:

Hrm, I had it working on a pair of powerbooks yesterday. Can you try
something like "udbg-immortal" on your kernel command line to see if
that makes a difference in the output ?

Cheers,
Ben.

> ----------------------
> git bisect start
> # good: [61c4f2c81c61f73549928dfd9f3e8f26aa36a8cf] Linux 2.6.39
> git bisect good 61c4f2c81c61f73549928dfd9f3e8f26aa36a8cf
> # bad: [55922c9d1b84b89cb946c777fddccb3247e7df2c] Linux 3.0-rc1
> git bisect bad 55922c9d1b84b89cb946c777fddccb3247e7df2c
> # bad: [c44dead70a841d90ddc01968012f323c33217c9e] Merge branch 'usb-next'
> of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
> git bisect bad c44dead70a841d90ddc01968012f323c33217c9e
> # bad: [d93515611bbc70c2fe4db232e5feb448ed8e4cc9] macvlan: fix panic if
> lowerdev in a bond
> git bisect bad d93515611bbc70c2fe4db232e5feb448ed8e4cc9
> ----------------------
>
> Any ideas?
>
> Thanks,
> Christian.

2011-06-01 00:48:20

by Christian Kujau

Subject: Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout

On Wed, 1 Jun 2011 at 10:25, Benjamin Herrenschmidt wrote:
> Hrm, I had it working on a pair of powerbooks yesterday. Can you try
> something like "udbg-immortal" on your kernel command line to see if
> that makes a difference in the output ?

I'll try in a minute.

In the meantime, "git bisect" behaves kinda weird, I don't know what went
wrong here:

$ git bisect start
$ git bisect good # Linux 2.6.39
$ git bisect bad v3.0-rc1 # Linux 3.0-rc1
$ git bisect bad # c44dead70a...
$ git bisect bad # d93515611b..

...yet the ./Makefile shows[0] that I'm already way behind: 2.6.39-rc2.
Maybe "git bisect" got confused with that whole 2.6.x -> 3.0 renaming?

Christian.

[0] http://nerdbynature.de/bits/3.0-rc1/
--
BOFH excuse #383:

Your processor has taken a ride to Heaven's Gate on the UFO behind Hale-Bopp's comet.

2011-06-01 01:08:54

by Christian Kujau

Subject: Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout

On Tue, 31 May 2011 at 17:48, Christian Kujau wrote:
> On Wed, 1 Jun 2011 at 10:25, Benjamin Herrenschmidt wrote:
> > Hrm, I had it working on a pair of powerbooks yesterday. Can you try
> > something like "udbg-immortal" on your kernel command line to see if
> > that makes a difference in the output ?
>
> I'll try in a minute.

Wow, it really did make a difference:

http://nerdbynature.de/bits/3.0-rc1/
* linux-3.0_powerpc_2.jpg
* linux-3.0_powerpc_2.mp4 (only a few(!) seconds long,
  best viewed by dragging the slider in VLC or QuickTime, to
  get at least a grasp of what led to linux-3.0_powerpc_2.jpg)

Thanks,
Christian.
--
BOFH excuse #45:

virus attack, luser responsible

2011-06-01 03:02:52

by Christian Kujau

Subject: Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout

(Cc'ing Linus)

On Tue, 31 May 2011 at 17:48, Christian Kujau wrote:
> In the meantime, "git bisect" behaves kinda weird, I don't know what went
> wrong here:
>
> $ git bisect start
> $ git bisect good # Linux 2.6.39
> $ git bisect bad v3.0-rc1 # Linux 3.0-rc1
> $ git bisect bad # c44dead70a...
> $ git bisect bad # d93515611b..
>
> ...yet the ./Makefile shows[0] that I'm already way behind: 2.6.39-rc2.
> Maybe "git bisect" got confused with that whole 2.6.x -> 3.0 renaming?

Hm, I tried again, from a clean v3.0-rc1 (git reset --hard), but after the
2nd "git bisect bad" I'm at 2.6.39-rc2 again - while I /should/ be somewhere
in between v2.6.39..v3.0-rc1, right?

Help, please!
Christian.

[0] http://nerdbynature.de/bits/3.0-rc1/
--
BOFH excuse #54:

Evil dogs hypnotised the night shift

2011-06-01 03:49:20

by Benjamin Herrenschmidt

Subject: Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout

On Tue, 2011-05-31 at 20:02 -0700, Christian Kujau wrote:
> (Cc'in Linus)
>
> On Tue, 31 May 2011 at 17:48, Christian Kujau wrote:
> > In the meantime, "git bisect" behaves kinda weird, I don't know what went
> > wrong here:
> >
> > $ git bisect start
> > $ git bisect good # Linux 2.6.39
> > $ git bisect bad v3.0-rc1 # Linux 3.0-rc1
> > $ git bisect bad # c44dead70a...
> > $ git bisect bad # d93515611b..
> >
> > ...yet the ./Makefile shows[0] that I'm already way behind: 2.6.39-rc2.
> > Maybe "git bisect" got confused with that whole 2.6.x -> 3.0 renaming?
>
> Hm, I tried again, from a clean v3.0-rc1 (git reset --hard), but after the
> 2nd "git bad" I'm at 2.6.39-rc2 again - while I /should/ be somwhere
> inbetween v2.6.39..v3.0-rc1, right?

Kernel version is totally irrelevant when bisecting. You are not walking
through a linear series of patches but a complex tree of merges which
might have forked off different versions in the first place.

Cheers,
Ben.
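
To make the point about merges concrete, here is a minimal sketch in C of a
toy commit graph. The graph, the commit names and the helper functions are
all made up for illustration; this is not the real kernel history and not
how git itself is implemented. A topic branch forks from an older -rc2
commit and is merged only after the 2.6.39 release. Bisect candidates are
the commits reachable from the bad commit but not from the good one, so the
topic commit is a candidate even though its Makefile still reports
2.6.39-rc2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PARENTS 2

/* Toy commit: a label, the version its Makefile would report, and parents. */
struct commit {
	const char *name;
	const char *makefile_version;
	int parents[MAX_PARENTS];	/* indices into graph[], -1 if unused */
};

/*
 * Hypothetical history (NOT the real kernel graph):
 *
 *   0:rc2 --- 1:release (2.6.39) --- 3:merge (3.0-rc1)
 *      \                            /
 *       `-------- 2:topic ---------'
 */
const struct commit graph[] = {
	{ "rc2",     "2.6.39-rc2", { -1, -1 } },
	{ "release", "2.6.39",     {  0, -1 } },
	{ "topic",   "2.6.39-rc2", {  0, -1 } },
	{ "merge",   "3.0-rc1",    {  1,  2 } },
};

#define NCOMMITS (sizeof(graph) / sizeof(graph[0]))

/* Mark every commit reachable from 'start', including 'start' itself. */
void mark_reachable(int start, bool *seen)
{
	if (start < 0 || seen[start])
		return;
	seen[start] = true;
	for (int i = 0; i < MAX_PARENTS; i++)
		mark_reachable(graph[start].parents[i], seen);
}

/*
 * A commit is a bisect candidate if it is reachable from the bad
 * commit but not from the good one.
 */
bool is_bisect_candidate(int commit, int good, int bad)
{
	bool from_good[NCOMMITS] = { false };
	bool from_bad[NCOMMITS] = { false };

	assert(commit >= 0 && (size_t)commit < NCOMMITS);
	mark_reachable(good, from_good);
	mark_reachable(bad, from_bad);
	return from_bad[commit] && !from_good[commit];
}
```

With good = 1 (release) and bad = 3 (merge), commit 2 (topic) is a
candidate although its version string is older than the good commit's,
which matches what the Makefile showed during the bisect above.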

2011-06-02 00:16:30

by Christian Kujau

Subject: Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout

On Tue, 31 May 2011 at 16:50, Christian Kujau wrote:
> trying to boot 3.0-rc1 on powerpc32 only progresses until:
>
> > Kernel virtual memory layout:
> > * 0xfffcf000..0xfffff000 : fixmap

After hours (and hours!) of git-bisecting, it said:

-----------------------
ccc7c28af205888798b51b6cbc0b557ac1170a49 is the first bad commit
commit ccc7c28af205888798b51b6cbc0b557ac1170a49
Author: Rafał Miłecki <[email protected]>
Date: Fri Apr 1 13:26:52 2011 +0200

ssb: pci: implement serdes workaround

Signed-off-by: Rafał Miłecki <[email protected]>
Signed-off-by: John W. Linville <[email protected]>
-----------------------

When I reverted this one from the git-bisected tree, the box continued to
boot (until it got stuck again during IDE/CDROM init, but that may be a
different story). I'll try to revert this from a vanilla 3.0-rc1 and see
if it helps.

Thanks,
Christian.

Full git-bisect log: http://nerdbynature.de/bits/3.0-rc1/

> And then the system hangs, does not respond to keyboard (sysrq does not
> seem to work on this PowerBook G4). But after a while the system reboots
> itself, so I guess the machine panicked but did not print anything on the
> screen.
>
> Full messages (picture), config & (working) dmesg:
>
> http://nerdbynature.de/bits/3.0-rc1/
>
--
BOFH excuse #406:

Bad cafeteria food landed all the sysadmins in the hospital.

2011-06-02 00:47:33

by Benjamin Herrenschmidt

Subject: Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout

On Wed, 2011-06-01 at 17:16 -0700, Christian Kujau wrote:
> ccc7c28af205888798b51b6cbc0b557ac1170a49 is the first bad commit
> commit ccc7c28af205888798b51b6cbc0b557ac1170a49
> Author: Rafał Miłecki <[email protected]>
> Date: Fri Apr 1 13:26:52 2011 +0200
>
> ssb: pci: implement serdes workaround
>
> Signed-off-by: Rafał Miłecki <[email protected]>
> Signed-off-by: John W. Linville <[email protected]>
> -----------------------
>
> When I reverted this one from the gi-bisected tree, the box continued to
> boot (until it got stuck again during IDE/CDROM init, but that may be a
> different story). I'l; try to revert this from a vanilla 3.0-rc1 and see
> if it helps

Thanks. I'll have a look later today. As for the IDE/CDROM init, have
you tried the very latest Linus snapshot ? Does that still happen ?
What kind of error do you observe ?

There was some time during the 3.0 merge window process when interrupts
were broken on some PowerBooks, but that should be fixed now.

Cheers,
Ben.

2011-06-02 02:58:31

by Benjamin Herrenschmidt

Subject: Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout

On Wed, 2011-06-01 at 17:16 -0700, Christian Kujau wrote:
> On Tue, 31 May 2011 at 16:50, Christian Kujau wrote:
> > trying to boot 3.0-rc1 on powerpc32 only progresses until:
> >
> > > Kernel virtual memory layout:
> > > * 0xfffcf000..0xfffff000 : fixmap
>
> After hours (and hours!) of git-bisecting, it said:
>
> -----------------------
> ccc7c28af205888798b51b6cbc0b557ac1170a49 is the first bad commit
> commit ccc7c28af205888798b51b6cbc0b557ac1170a49
> Author: Rafał Miłecki <[email protected]>
> Date: Fri Apr 1 13:26:52 2011 +0200
>
> ssb: pci: implement serdes workaround
>
> Signed-off-by: Rafał Miłecki <[email protected]>
> Signed-off-by: John W. Linville <[email protected]>
> -----------------------

Ok, thanks a lot. It looks rather trivial actually: that new workaround
is PCIe-specific but is called unconditionally, and will do bad things
on non-PCIe implementations.

John, care to send the patch below to Linus ASAP ? I could reproduce and
verify it fixes it. Thanks !

ssb: pci: Don't call PCIe specific workarounds on PCI cores

Otherwise it can/will crash....

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
---

diff --git a/drivers/ssb/driver_pcicore.c b/drivers/ssb/driver_pcicore.c
index 82feb34..eddf1b9 100644
--- a/drivers/ssb/driver_pcicore.c
+++ b/drivers/ssb/driver_pcicore.c
@@ -540,7 +540,8 @@ void ssb_pcicore_init(struct ssb_pcicore *pc)
 		ssb_pcicore_init_clientmode(pc);
 
 	/* Additional always once-executed workarounds */
-	ssb_pcicore_serdes_workaround(pc);
+	if (dev->id.coreid == SSB_DEV_PCIE)
+		ssb_pcicore_serdes_workaround(pc);
 	/* TODO: ASPM */
 	/* TODO: Clock Request Update */
 }
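
The idea behind the fix can be shown outside the kernel with a small
stand-alone sketch. Everything below (the enum, the struct, the function
names and the flag) is a hypothetical stand-in rather than the real SSB
API; it only illustrates guarding a PCIe-only workaround behind a core-type
check so that plain PCI cores never run it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the SSB core types; values are illustrative. */
enum core_id { CORE_PCI, CORE_PCIE };

struct pcicore {
	enum core_id coreid;
	bool serdes_workaround_applied;
};

/* Stand-in for a workaround that touches PCIe-only registers. */
void serdes_workaround(struct pcicore *pc)
{
	pc->serdes_workaround_applied = true;
}

/* Run the PCIe-only workaround only when the core really is PCIe. */
void pcicore_init(struct pcicore *pc)
{
	assert(pc != NULL);
	pc->serdes_workaround_applied = false;
	if (pc->coreid == CORE_PCIE)
		serdes_workaround(pc);
}
```

Calling pcicore_init() on a core with coreid CORE_PCI leaves the flag
false, while a CORE_PCIE core gets the workaround applied.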

2011-06-02 03:06:50

by Christian Kujau

Subject: Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout

On Thu, 2 Jun 2011 at 12:57, Benjamin Herrenschmidt wrote:
> Ok, thanks a lot, It looks rather trivial actually: That new workaround
> is PCIe specific but is called unconditionally, and will do bad things
> non-PCIe implementations.

Indeed. This PowerBook G4 does not have PCIe, yet the whole SSB thingy gets
enabled in my .config somehow. Thanks for the quick fix, I tried to revert
ccc7c28af2... from Linus' current tree, but I had to rip out some more to
make it compile.

I'll try your fix in a minute and get back to you with those cdrom init
problems as well.

Thanks,
Christian.
--
BOFH excuse #166:

/pub/lunch

2011-06-02 04:27:16

by Christian Kujau

Subject: Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout

On Thu, 2 Jun 2011 at 12:57, Benjamin Herrenschmidt wrote:
> Ok, thanks a lot, It looks rather trivial actually: That new workaround
> is PCIe specific but is called unconditionally, and will do bad things
> non-PCIe implementations.

OK, with your patch applied to Linus' latest git tree the machine
continues to boot. Also, with the latest tree, the "machine is stuck after
ide-cd init" problem[0] went away.

For this particular problem and patch, feel free to add:

Tested-by: Christian Kujau <[email protected]>

However, shortly after booting and logging in to the box remotely, the box
did not respond any more. I'm not sure if this is related to those SSB/PCIe
changes, but somehow I hope it is - bisecting this would take much
longer, as it's not an "instant" death:

* http://nerdbynature.de/bits/3.0-rc1/linux-3.0-rc1_stuck1.jpg
* http://nerdbynature.de/bits/3.0-rc1/linux-3.0-rc1_stuck2.jpg

This is what an OCR program made of it:

irq euent stamp: 185804850
hardirqs last enabled at (185904849): [<c04005b0>] _raw_spin_unlock_irqrestore+0x40/0x?e
hardirqs last disabled at (185904850): [<c00120b8>] reenable_mmu+0x24/0x78
Softirqs last enabled at (185892414): [<c000fe8c>] call_do_softirq+0x14/0x24
softirqs last disabled at (18589240?): [<c000fe8c>] call_do_softirq+0x14/0x24
NIP: e04005b4 LR: e04005b0 CTR: 00000000
REGS: ef92be10 TRHP: 0901 Not tainted (3.0.0-rel-00049-g1fa?b6a-dirtg)
MSB: 00009032 <EE.ME.IR.DR> CR: 42002084
TRSK = ef8d0000[38B] ’kuorker/0:2’ THREAD:
GPR00: c04005b0 ef92bec0 efBd0000 00000001
GPR08: 00000000 0b14aed0 0049a306 00030600
HIP [c01005b1] _rau_spin_unlock_irqrestore+0x44/0x?c
LR [c04005b0] _rau_spin_unlock_irqrestore+0x40/0x?c
Call Trace:
[ef92bec0] [c04005b0] _raw_spin_unlock_irqrestore+0x40/0x?c (unreliable)
[ef92bed0] [c029c504] flush_tu_ldisc+0x121/0x230
[ef92bf10] [c001c86c] process_one_uork+0x1c1/0x4cB
[ef92bfS0] [c004efac] worker_thread+0x1?8/0x3c1
[ef92bf90] [c0051148] kthread+0x81/0x88
[ef92hff0] [c0810390] kernel_thread+0x1c/0x68

XER: 20000000
ef92a000 ef8d0660 00000006 00000000 18614000 22002088
Instruction dump:
??? 93e1060c ?c9f23?B 38800001 90010011 4bc6e9a9 ?fc3i`3?8 4be61a69
?3e08080 11820021 1bc6b515 ?fe00124
B8c16008 ?c0803a6 83c1000c

Well, the picture is way better :-\

Thanks,
Christian.

[0] http://nerdbynature.de/bits/3.0-rc1/linux-3.0-rc1-cdrom.jpg
--
BOFH excuse #399:

We are a 100% Microsoft Shop.

2011-06-02 06:00:22

by Rafał Miłecki

Subject: Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout

2011/6/2 Christian Kujau <[email protected]>:
> On Tue, 31 May 2011 at 16:50, Christian Kujau wrote:
>> trying to boot 3.0-rc1 on powerpc32 only progresses until:
>>
>>   > Kernel virtual memory layout:
>>   >   * 0xfffcf000..0xfffff000  : fixmap
>
> After hours (and hours!) of git-bisecting, it said:
>
> -----------------------
> ccc7c28af205888798b51b6cbc0b557ac1170a49 is the first bad commit
> commit ccc7c28af205888798b51b6cbc0b557ac1170a49
> Author: Rafał Miłecki <[email protected]>
> Date:   Fri Apr 1 13:26:52 2011 +0200
>
>    ssb: pci: implement serdes workaround
>
>    Signed-off-by: Rafał Miłecki <[email protected]>
>    Signed-off-by: John W. Linville <[email protected]>
> -----------------------

I'm sorry for the problem :(

A patch was already sent yesterday, I've even CCed linuxppc-dev:
[RFT][PATCH 3.0] ssb: fix PCI(e) driver regression causing oops on PCI cards

--
Rafał

2011-06-02 06:07:21

by Rafał Miłecki

Subject: Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout

On Tue, 31 May 2011 at 16:50, Christian Kujau wrote:
> trying to boot 3.0-rc1 on powerpc32 only progresses until:
>
>   > Kernel virtual memory layout:
>   >   * 0xfffcf000..0xfffff000  : fixmap

The weird thing is that:

1) You didn't see (like Andres):
Machine check in kernel mode.
Caused by (from SRR1=149030): Transfer error ack signal
Oops: Machine check, sig: 7 [#1]
But, OK, maybe machine check requires something additional in kernel,
I don't know...

2) You didn't see SSB messages
This is confusing. You should see SSB messages that appear before my
invalid read happens. Did you somehow disable most of the important
logs, or something? Having SSB messages at the end of the hung boot would
point you directly to the ssb module.

--
Rafał

2011-06-02 06:17:03

by Christian Kujau

Subject: Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout

On Thu, 2 Jun 2011 at 08:07, Rafał Miłecki wrote:
> 1) You didn't see (like Andres):
> Machine check in kernel mode.
> Caused by (from SRR1=149030): Transfer error ack signal
> Oops: Machine check, sig: 7 [#1]
> But, OK, maybe machine check requires something additional in kernel,
> I don't know...
>
> 2) You didn't see SSB messages
> This is confusing. You should see SSB messages that appear before my
> invalid read happens. Did you somehow disable most of the important
> logs, or sth? Having ssb messages and the end of hung boot would
> directly point you to ssb module.

BenH advised to boot with udbg-immortal and out came:

http://nerdbynature.de/bits/3.0-rc1/linux-3.0_powerpc_2.jpg
http://nerdbynature.de/bits/3.0-rc1/linux-3.0_powerpc_2.mp4
(watch it at very slow speed, as it's only 3sec long)

I've enabled[0] FB_NVIDIA and during normal booting the screen flickers
after the "... : fixmap" message and the screen clears and is filled again
from the top - maybe the messages would've been there if booted w/o the
framebuffer enabled.

Right now I'm happy that Ben's fix helped to get past this message, but
the system remains unusable[1] with the latest -git, but more debugging
has to wait until tomorrow...

Thanks,
Christian.

[0] http://nerdbynature.de/bits/3.0-rc1/config-2.6.39.txt
[1] https://lkml.org/lkml/2011/6/2/6
--
BOFH excuse #230:

Lusers learning curve appears to be fractal

2011-06-02 07:33:56

by Benjamin Herrenschmidt

Subject: Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout

On Wed, 2011-06-01 at 21:27 -0700, Christian Kujau wrote:
> On Thu, 2 Jun 2011 at 12:57, Benjamin Herrenschmidt wrote:
> > Ok, thanks a lot, It looks rather trivial actually: That new workaround
> > is PCIe specific but is called unconditionally, and will do bad things
> > non-PCIe implementations.
>
> OK, with your patch applied to Linus' latest git tree the machine
> continues to boot. Also, with the latest tree, the "machine is stuck after
> ide-cd init" problem[0] went away.
>
> For this particular problem and patch, feel free to add:
>
> Tested-by: Christian Kujau <[email protected]>
>
> However, shortly after boot and loggin in to the box remotely, the bux did
> not respond any more. I'm not sure if these are related to those SSB/PCIe
> changes, but somehow I hope they are - bisecting those would take much
> longer, as it's not an "instant" death:
>
> * http://nerdbynature.de/bits/3.0-rc1/linux-3.0-rc1_stuck1.jpg
> * http://nerdbynature.de/bits/3.0-rc1/linux-3.0-rc1_stuck2.jpg
>
> This is what an OCR program made of it:

I think this is another problem that I'm in the middle of trying to
figure out.

It -looks- to me that something goes wrong in the tty code when a large
file is piped through a pty, causing the kernel to hang for minutes in
the workqueue / ldisc flush code. I've just sent an initial report to
Alan Cox about it and am currently bisecting it.

Cheers,
Ben.

> irq euent stamp: 185804850
> hardirqs last enabled at (185904849): [<c04005b0>] _raw_spin_unlock_irqrestore+0x40/0x?e
> hardirqs last disabled at (185904850): [<c00120b8>] reenable_mmu+0x24/0x78
> Softirqs last enabled at (185892414): [<c000fe8c>] call_do_softirq+0x14/0x24
> softirqs last disabled at (18589240?): [<c000fe8c>] call_do_softirq+0x14/0x24
> NIP: e04005b4 LR: e04005b0 CTR: 00000000
> REGS: ef92be10 TRHP: 0901 Not tainted (3.0.0-rel-00049-g1fa?b6a-dirtg)
> MSB: 00009032 <EE.ME.IR.DR> CR: 42002084
> TRSK = ef8d0000[38B] ’kuorker/0:2’ THREAD:
> GPR00: c04005b0 ef92bec0 efBd0000 00000001
> GPR08: 00000000 0b14aed0 0049a306 00030600
> HIP [c01005b1] _rau_spin_unlock_irqrestore+0x44/0x?c
> LR [c04005b0] _rau_spin_unlock_irqrestore+0x40/0x?c
> Call Trace:
> [ef92bec0] [c04005b0] _raw_spin_unlock_irqrestore+0x40/0x?c (unreliable)
> [ef92bed0] [c029c504] flush_tu_ldisc+0x121/0x230
> [ef92bf10] [c001c86c] process_one_uork+0x1c1/0x4cB
> [ef92bfS0] [c004efac] worker_thread+0x1?8/0x3c1
> [ef92bf90] [c0051148] kthread+0x81/0x88
> [ef92hff0] [c0810390] kernel_thread+0x1c/0x68
>
> XER: 20000000
> ef92a000 ef8d0660 00000006 00000000 18614000 22002088
> Instruction dump:
> ??? 93e1060c ?c9f23?B 38800001 90010011 4bc6e9a9 ?fc3i`3?8 4be61a69
> ?3e08080 11820021 1bc6b515 ?fe00124
> B8c16008 ?c0803a6 83c1000c
>
> Well, the picture is way better :-\
>
> Thanks,
> Christian.
>
> [0] http://nerdbynature.de/bits/3.0-rc1/linux-3.0-rc1-cdrom.jpg

2011-06-06 02:11:22

by Christian Kujau

Subject: Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout

On Thu, 2 Jun 2011 at 17:33, Benjamin Herrenschmidt wrote:
> It -looks- to me that something goes wrong in the tty code when a large
> file is piped through a pty, causing the kernel to hang for minutes in
> the workqueue / ldisk flush code. I've just sent an initial report to
> Alan Cox about it and am currently bisecting it.

This was the "tty vs workqueue oddities" thread, right? FWIW,
55db4c64eddf37 ("Revert "tty: make receive_buf() return the amout of bytes
received"") seems to have fixed it on this powerpc machine as well.

With your "ssb: pci: Don't call PCIe specific workarounds on PCI cores"
patch applied, powerpc32 seems to be quite happy with 3.0-rc1+

Thanks,
Christian.
--
BOFH excuse #382:

Someone was smoking in the computer room and set off the halon systems.

2011-06-06 03:47:31

by Benjamin Herrenschmidt

Subject: Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout

On Sun, 2011-06-05 at 19:11 -0700, Christian Kujau wrote:
> On Thu, 2 Jun 2011 at 17:33, Benjamin Herrenschmidt wrote:
> > It -looks- to me that something goes wrong in the tty code when a large
> > file is piped through a pty, causing the kernel to hang for minutes in
> > the workqueue / ldisk flush code. I've just sent an initial report to
> > Alan Cox about it and am currently bisecting it.
>
> This was the "tty vs workqueue oddities" thread, right? FWIW,
> 55db4c64eddf37 ("Revert "tty: make receive_buf() return the amout of bytes
> received"") seems to have fixed it on this powerpc machine as well.

Yup.

> With your "ssb: pci: Don't call PCIe specific workarounds on PCI cores"
> patch applied, powerpc32 seems to be quite happy with 3.0-rc1+

Good :-)

Cheers,
Ben.

2011-06-10 22:54:38

by Christian Kujau

Subject: Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout

On Thu, 2 Jun 2011 at 12:57, Benjamin Herrenschmidt wrote:
> John, care to send the patch below to Linus ASAP ? I could reproduce and
> verify it fixes it. Thanks !
>
> ssb: pci: Don't call PCIe specific workarounds on PCI cores
>
> Otherwise it can/will crash....

The patch did not make it into -rc2, it's not in today's git tree either,
AFAICS. Can anyone push this, please?

Thanks,
Christian.

> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
> ---
>
> diff --git a/drivers/ssb/driver_pcicore.c b/drivers/ssb/driver_pcicore.c
> index 82feb34..eddf1b9 100644
> --- a/drivers/ssb/driver_pcicore.c
> +++ b/drivers/ssb/driver_pcicore.c
> @@ -540,7 +540,8 @@ void ssb_pcicore_init(struct ssb_pcicore *pc)
> ssb_pcicore_init_clientmode(pc);
>
> /* Additional always once-executed workarounds */
> - ssb_pcicore_serdes_workaround(pc);
> + if (dev->id.coreid == SSB_DEV_PCIE)
> + ssb_pcicore_serdes_workaround(pc);
> /* TODO: ASPM */
> /* TODO: Clock Request Update */
> }
>
--
BOFH excuse #312:

incompatible bit-registration operators

2011-06-10 22:59:13

by Rafał Miłecki

Subject: Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout

2011/6/11 Christian Kujau <[email protected]>:
> On Thu, 2 Jun 2011 at 12:57, Benjamin Herrenschmidt wrote:
>> John, care to send the patch below to Linus ASAP ? I could reproduce and
>> verify it fixes it. Thanks !
>>
>> ssb: pci: Don't call PCIe specific workarounds on PCI cores
>>
>> Otherwise it can/will crash....
>
> The patch did not make it into -rc2, it's not in today's git tree either,
> AFAICS. Can anyone push this, please?

Yeah, I noticed it wasn't in the pull for rc2. I pinged John, he told
me to just wait.

Patch was taken with the recent pull, it should go into rc3.

--
Rafał