2002-07-18 12:26:23

by Martin Devera

Subject: 2.4.18 is not SMP friendly

Hello,

Is someone here running 2.4.18 on PII SMP successfully ?
My SMP box was happily running 2.4.3 but after the upgrade
to 2.4.18 I got 3 oopses in 4 days.
All were FS related: one during heavy access to SCSI and
IDE in parallel (I posted ksymoops output recently but nobody
seemed interested), one while cdrecord was running in parallel
with the SCSI HDD (IDE CD writer), and the latest when trying to
mount an IDE ZIP drive with a corrupted ZIP floppy. The latest
resulted in a system panic and freeze, so no output here :(

This is like a scream into the dark because I rebooted with
maxcpus=1 and it seems to be OK now, and I don't want to
experiment with a production server anymore.
But if someone knows the problem I'm willing to test some
patches, hacks .. etc.
Seems to me like a missing spinlock somewhere ..

thanks,
devik
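
For anyone trying the same maxcpus=1 workaround, one way to confirm that the option actually reached the kernel is to look for it in the boot command line (on Linux this is exposed as /proc/cmdline). The following is only a minimal sketch: the function name parse_maxcpus is made up for illustration and is not a kernel interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns the value of a "maxcpus=N" token in a
 * kernel command line string, or -1 if no such token is present. */
static long parse_maxcpus(const char *cmdline)
{
    static const char key[] = "maxcpus=";
    const size_t klen = sizeof key - 1;
    const char *p = cmdline;

    assert(cmdline != NULL);
    while ((p = strstr(p, key)) != NULL) {
        /* Accept the match only if it starts a token. */
        if (p == cmdline || p[-1] == ' ') {
            char *end;
            long n = strtol(p + klen, &end, 10);

            /* Require at least one digit and a clean token end. */
            if (end != p + klen &&
                (*end == '\0' || *end == ' ' || *end == '\n'))
                return n;
        }
        p += klen;
    }
    return -1;
}
```

Reading one line from /proc/cmdline with fgets() and passing it to parse_maxcpus() should give 1 on a box booted the way devik describes, and -1 when the option is absent.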


2002-07-18 12:31:34

by Alan

Subject: Re: 2.4.18 is not SMP friendly

On Thu, 2002-07-18 at 11:51, devik wrote:
> Is someone here running 2.4.18 on PII SMP successfully ?

PPro in my case but yes. 2.4.18 ought to be pretty solid except for some
annoying bugs you'll only hit if you use smbfs.

2002-07-18 13:14:15

by Kelledin

Subject: Re: 2.4.18 is not SMP friendly

On Thursday 18 July 2002 08:45 am, Alan Cox wrote:
> On Thu, 2002-07-18 at 11:51, devik wrote:
> > Is someone here running 2.4.18 on PII SMP successfully ?
>
> PPro in my case but yes. 2.4.18 ought to be pretty solid
> except for some annoying bugs you'll only hit if you use
> smbfs.

I, too, am running a dual PPro box on 2.4.18. It's been solid
from the get-go.

By the way, what are these bugs with smbfs? I haven't hit them
on my dual ppro box, probably because the box never runs as a
samba client (just a samba server).

--
Kelledin
"If a server crashes in a server farm and no one pings it, does
it still cost four figures to fix?"

2002-07-18 13:34:15

by Richard Ems

Subject: Re: 2.4.18 is not SMP friendly


Running on 2 x Pentium III (Coppermine) (700 and 1000 MHz) also works
OK.

Kernels are SuSE's k_smp-2.4.18-183, that means 2.4.19aa... + more SuSE
patches.

Sorry, no 2 x Pentium II.

--
Richard Ems
... e-mail: [email protected]
... Computer Science, University of Hamburg

Unix IS user friendly. It's just selective about who its friends are.

2002-07-18 13:48:37

by Tommy Faasen

Subject: Re: 2.4.18 is not SMP friendly

> On Thu, 2002-07-18 at 11:51, devik wrote:
>> Is someone here running 2.4.18 on PII SMP successfully ?
>

No problems on my side, on 2.4.18 and 2.4.18-wolk-3.5rc3.

> PPro in my case but yes. 2.4.18 ought to be pretty solid except for some
> annoying bugs you'll only hit if you use smbfs.
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel"
> in the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/



2002-07-18 14:09:11

by mbs

Subject: Re: 2.4.18 is not SMP friendly

I've had problems w/P4 SMP on 2.4.18 and RH2.4.18-3 where after a while
(30-40 min after boot) it would slow to a crawl, and the disk would be
constantly going, but CPU usage would be ~0%, with 2 gigs of RAM and nothing
running (not even X).

RH2.4.18-5 does not seem to have the problem.

On Thursday 18 July 2002 09:53, Tommy Faasen wrote:
> > On Thu, 2002-07-18 at 11:51, devik wrote:
> >> Is someone here running 2.4.18 on PII SMP successfully ?
>
> No problems on my side, on 2.4.18 and 2.4.18-wolk-3.5rc3.
>
> > PPro in my case but yes. 2.4.18 ought to be pretty solid except for some
> > annoying bugs you'll only hit if you use smbfs.
> >

--
/**************************************************
** Mark Salisbury || [email protected] **
** If you would like to sponsor me for the **
** Mass Getaway, a 150 mile bicycle ride to for **
** MS, contact me to donate by cash or check or **
** click the link below to donate by credit card **
**************************************************/
https://www.nationalmssociety.org/pledge/pledge.asp?participantid=86736

2002-07-18 14:07:27

by J.A. Magallon

Subject: Re: 2.4.18 is not SMP friendly


On 2002.07.18 devik wrote:
>Hello,
>
>Is someone here running 2.4.18 on PII SMP successfully ?
>My SMP box was happily running 2.4.3 but after upgrade
>to 2.4.18 I got 3 oopses in 4 days.

Solid as a rock on dual PII@400. Also on a dual Xeon and on a bunch of
dual PIII boxes. I even run jam kernels built with gcc 3.1.1, but when
I get into trouble 2.4.18 is there.

--
J.A. Magallon \ Software is like sex: It's better when it's free
mailto:[email protected] \ -- Linus Torvalds, FSF T-shirt
Linux werewolf 2.4.19-rc2-jam1, Mandrake Linux 8.3 (Cooker) for i586
gcc (GCC) 3.1.1 (Mandrake Linux 8.3 3.1.1-0.7mdk)

2002-07-18 14:16:04

by Keith Driscoll

Subject: Re: 2.4.18 is not SMP friendly

I've been running a stock 2.4.18 on this machine (2 x P-II 300) since May(?)
with no problems. Unfortunately, I didn't see the original message (or don't
remember it), so I don't know what the OP's question/problem was.

Keith.
>
>
> Running on 2 x Pentium III (Coppermine) (700 and 1000 Mhz) works also
> ok.
>
> Kernels are SuSE's k_smp-2.4.18-183, that means 2.4.19aa... + more SuSE
> patches.
>
> Sorry, no 2 x Pentium II.
>
> --
> Richard Ems
> ... e-mail: [email protected]
> ... Computer Science, University of Hamburg
>
> Unix IS user friendly. It's just selective about who its friends are.


--
|Keith Driscoll | Quis custodiet ipsos custodes? |
|[email protected] | All opinions are MINE, MINE, MINE, |
| | and I refuse to share them with |
| GO BUCKS!!! | anyone, so there! TPPPHT!!!!! |

2002-07-18 15:07:54

by Mika Liljeberg

Subject: SMP & MCE [Was: 2.4.18 is not SMP friendly]

On Thu, 2002-07-18 at 13:51, devik wrote:
> Hello,
>
> Is someone here running 2.4.18 on PII SMP successfully ?
> My SMP box was happily running 2.4.3 but after upgrade
> to 2.4.18 I got 3 oopses in 4 days.

2 x PII (Deschutes, dA0 core). So far so good, uptime nearly 2 days now.
In fact, I'm starting to have a glimmer of hope that I might finally
have licked (fingers crossed) a really ugly system freeze problem which
has been bugging me ever since I moved on from 2.4.0-test9 [solid freeze
in less than 24 hours, on average]. I have tried numerous kernels after
that, none of them helped. Not one.

Well, a few days ago I got a Machine Check Exception in the log file,
basically complaining about a catastrophic memory system inconsistency.
First time I ever saw this, despite hundreds of lockups. I thought,
whaddaya know, maybe it really is a hardware problem.

So how come 2.4.0-test9 and older kernels appear to work ok?

[You might ask why I'm not running a kernel that I know is more stable.
Well, my home system is not that important and I've sort of learned to
live with the lockups. I usually shut it down for the night, so the
average uptime is good enough most days. It really is no worse than
trying to run Win98, and ext3 does help a lot.]

Anyway, I had already resigned myself to my fate, but now I decided to
investigate again. It turns out that Machine Check Exceptions were, for
the very first time, enabled by default in 2.4.0-test10. Also, it turns
out that the PII has a surprising number of Errata related to SMP and
MCEs. Almost all of them lead to a catastrophic failure and CPU
shutdown. Correct execution of the MCE handler is not guaranteed either.
Exactly the kind of behaviour I have been seeing. Coincidence? Maybe.
It's the only hypothesis I've got, so I'm putting it to the test.

According to the PII errata, some of the lockups could be eliminated by
simply not enabling MCE at all. Unfortunately, this is not true for all
of them. Besides, there appear to be other SMP related ones that are
really ugly and completely unrelated to MCE. The worst of the errata
could, however, be worked around with a BIOS patch (i.e., microcode
update). Fat chance. It turns out my mobo vendor never bothered to put
most of the IA32 microcode updates into the BIOS (thanks a lot
Giga-Byte!).

Anyway, I'm now running 2.4.18 with the machine check exceptions
disabled. I've also compiled the microcode upgrade driver into the
kernel and upgrade the microcode on both CPUs during Linux boot. Maybe
it helps.

I hope this tirade is useful to someone who is suffering from mysterious
lockups or strange MCEs. Mostly I'm just happy that I have finished it
and my machine is still running.

Cheers,

MikaL


2002-07-18 15:20:38

by Urban Widmark

Subject: Re: 2.4.18 is not SMP friendly

On Thu, 18 Jul 2002, Kelledin wrote:

> By the way, what are these bugs with smbfs? I haven't hit them
> on my dual ppro box, probably because the box never runs as a
> samba client (just a samba server).

If you have characters in the filenames that don't match the charset you
use locally, it will end up thinking that the name is 0xffffffff long and
oops when it tries to access beyond the mapped memory. The old code just
put a ? in the string.

/Urban
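
The failure mode Urban describes is a familiar C pattern: a conversion routine signals an error with -1, the caller stores that result in an unsigned length, and the length becomes 0xffffffff. The sketch below is a user-space illustration only, not the smbfs code; convert_name, name_len_buggy and name_len_tolerant are hypothetical names, and "unmappable" is simplified to "any byte above 0x7f".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical converter: copies ASCII bytes and returns the number of
 * bytes written, or -1 if it meets a byte it cannot map. */
static int convert_name(const unsigned char *in, size_t in_len,
                        char *out, size_t out_size)
{
    size_t i;

    assert(in != NULL && out != NULL);
    if (in_len > out_size)
        return -1;
    for (i = 0; i < in_len; i++) {
        if (in[i] > 0x7f)
            return -1;              /* unmappable character */
        out[i] = (char)in[i];
    }
    return (int)in_len;
}

/* Buggy pattern: the -1 error return ends up in an unsigned length. */
static uint32_t name_len_buggy(const unsigned char *in, size_t in_len,
                               char *out, size_t out_size)
{
    return (uint32_t)convert_name(in, in_len, out, out_size);
}

/* Tolerant pattern: put a '?' in place of each unmappable byte, which
 * is how the old behaviour is described above. */
static uint32_t name_len_tolerant(const unsigned char *in, size_t in_len,
                                  char *out, size_t out_size)
{
    size_t i;

    if (in_len > out_size)
        in_len = out_size;          /* truncate rather than overrun */
    for (i = 0; i < in_len; i++)
        out[i] = (in[i] > 0x7f) ? '?' : (char)in[i];
    return (uint32_t)in_len;
}
```

With the three-byte name { 'a', 0xe9, 'b' }, the buggy variant reports a length of 0xffffffff, while the tolerant one reports 3 and produces "a?b". Any code that later walks the name using the bogus length will run far past the buffer.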

2002-07-18 15:33:01

by Chris Ricker

Subject: Re: 2.4.18 is not SMP friendly

On 18 Jul 2002, Alan Cox wrote:

> On Thu, 2002-07-18 at 11:51, devik wrote:
> > Is someone here running 2.4.18 on PII SMP successfully ?
>
> PPro in my case but yes. 2.4.18 ought to be pretty solid except for some
> annoying bugs you'll only hit if you use smbfs.

Or if you use data=journal w/ ext3....

later,
chris


2002-07-18 16:10:06

by Martin Devera

Subject: Re: 2.4.18 is not SMP friendly

Hi,
Yes, I use smbfs.
Regarding my oops report, is there a known
bug where a waitqueue would be corrupted ? When I analyzed
it I found that the invalid address 8bd4189c was loaded from
the tasklist pointer in wait_queue_head_t (sched.c, __wake_up_common,
line "p = curr->task").
The wakeup was called from get_new_inode and it seems as
if the list of tasks was not initialized or what :(

thanks, devik
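
To make the suspicion above concrete: a wait queue is a circular linked list hanging off a head, and the wakeup path walks that list and loads a task pointer from each entry. If the head was never initialized, or an entry was overwritten, the walk follows garbage. The following is a simplified user-space sketch of that structure, not the 2.4 kernel code; all type and function names here are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct list_node { struct list_node *next, *prev; };

struct task { int woken; };         /* hypothetical stand-in for a task */

struct wait_entry {
    struct task *task;
    struct list_node link;
};

struct wait_head { struct list_node tasks; };

/* An empty circular list points back at its own head. */
static void wait_head_init(struct wait_head *q)
{
    q->tasks.next = q->tasks.prev = &q->tasks;
}

/* Insert an entry right after the head. */
static void wait_add(struct wait_head *q, struct wait_entry *w)
{
    w->link.next = q->tasks.next;
    w->link.prev = &q->tasks;
    q->tasks.next->prev = &w->link;
    q->tasks.next = &w->link;
}

/* Cheap consistency check: catches a zeroed (never initialized) head
 * and neighbours that do not point back. It cannot detect arbitrary
 * garbage pointers, which would fault when dereferenced. */
static int wait_head_sane(const struct wait_head *q)
{
    const struct list_node *h = &q->tasks;

    return h->next != NULL && h->prev != NULL &&
           h->next->prev == h && h->prev->next == h;
}

/* Walk the list and mark every waiting task; returns the count. */
static int wake_all(struct wait_head *q)
{
    struct list_node *pos;
    int n = 0;

    assert(wait_head_sane(q));
    for (pos = q->tasks.next; pos != &q->tasks; pos = pos->next) {
        struct wait_entry *curr = (struct wait_entry *)
            ((char *)pos - offsetof(struct wait_entry, link));
        struct task *p = curr->task;    /* the kind of load that oopsed */

        p->woken = 1;
        n++;
    }
    return n;
}
```

On a properly initialized head, wake_all() returns 0 for an empty queue and visits each added entry exactly once. A head left zeroed fails wait_head_sane() before any pointer is followed, which is the sort of check that helps tell a missing initialization apart from later corruption.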

On 18 Jul 2002, Alan Cox wrote:

> On Thu, 2002-07-18 at 11:51, devik wrote:
> > Is someone here running 2.4.18 on PII SMP successfully ?
>
> PPro in my case but yes. 2.4.18 ought to be pretty solid except for some
> annoying bugs you'll only hit if you use smbfs.
>
>