2002-01-12 22:02:36

by Andreas Haumer

Subject: Linux-2.2.20 SMP & Asus CUR-DLS: "stuck on TLB IPI wait (CPU#3)"

Hi!

I'm seeing a problem with SMP Linux-2.2.20 on an ASUS CUR-DLS
motherboard. I noticed there were similar reports in the
past few months and I got the impression the problem should
already be fixed in 2.2.20, but seemingly it isn't.

We have a fileserver which was running Linux-2.2.18 without
a single problem for about 8 months or so. It has an Asus CUR-DLS
SMP motherboard with ServerWorks chipset and Asus Medallion
CUR-DLS ACPI BIOS revision 1009, dual Intel PIII CPU (866MHz)
and 512MB registered PC133 SDRAM.
There is also an Adaptec 29160 U160 SCSI controller and a
3Com 3c905B NIC in this server.

Today I upgraded to Linux-2.2.20, and after the reboot the
system became very slow: every now and then it halted for
3 or 4 seconds, and the kernel printed a lot of these
messages: "stuck on TLB IPI wait (CPU#3)"

I then rebooted and disabled BIOS MPS 1.4 support, but this
didn't help. I had to boot with the "noapic" option in order
to get the system running in a sane way.

I have to say that this is not a pristine Linux-2.2.20
kernel, as we included the following patches:

devfs-v99.21
aic7xxx-6.2.4
sw-raid-2.2.20-A0

but the same kernel is running on an Asus CUV4X-D SMP
(dual PIII 1GHz CPU) system (VIA chipset) without any
problem.

I found several mails on lkml reporting similar problems,
but no one reported them for this motherboard.

Some more information from the system running with "noapic":

root@schiller:~ {194} $ lspci -v
00:00.0 Host bridge: Relience Computer CNB20HE (rev 05)
        Flags: bus master, medium devsel, latency 32

00:00.1 Host bridge: Relience Computer CNB20HE (rev 05)
        Flags: bus master, medium devsel, latency 48

00:05.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
        Subsystem: 3Com Corporation: Unknown device 9055
        Flags: bus master, medium devsel, latency 32, IRQ 11
        I/O ports at d800
        Memory at fe000000 (32-bit, non-prefetchable)
        Capabilities: [dc] Power Management version 1

00:07.0 VGA compatible controller: ATI Technologies Inc: Unknown device 4752 (rev 27) (prog-if 00 [VGA])
        Subsystem: Asustek Computer, Inc.: Unknown device 802b
        Flags: bus master, stepping, medium devsel, latency 32
        Memory at fd000000 (32-bit, non-prefetchable)
        I/O ports at f000
        Memory at fc800000 (32-bit, non-prefetchable)
        Expansion ROM at febc0000 [disabled]
        Capabilities: [5c] Power Management version 2

00:0f.0 ISA bridge: Relience Computer: Unknown device 0200 (rev 50)
        Subsystem: Relience Computer: Unknown device 0200
        Flags: bus master, medium devsel, latency 0

00:0f.1 IDE interface: Relience Computer: Unknown device 0211 (prog-if 8a [Master SecP PriP])
        Flags: bus master, medium devsel, latency 32
        I/O ports at d000

01:03.0 SCSI storage controller: Adaptec 7892A (rev 02)
        Subsystem: Adaptec: Unknown device e2a0
        Flags: bus master, 66Mhz, medium devsel, latency 32, IRQ 10
        BIST result: 00
        I/O ports at b800 [disabled]
        Memory at fb800000 (64-bit, non-prefetchable)
        Capabilities: [dc] Power Management version 2

01:05.0 SCSI storage controller: Symbios Logic Inc. (formerly NCR): Unknown device 0020 (rev 01)
        Flags: bus master, medium devsel, latency 72, IRQ 15
        I/O ports at b400
        Memory at fb000000 (64-bit, non-prefetchable)
        Memory at fa800000 (64-bit, non-prefetchable)
        Capabilities: [40] Power Management version 2

01:05.1 SCSI storage controller: Symbios Logic Inc. (formerly NCR): Unknown device 0020 (rev 01)
        Flags: bus master, medium devsel, latency 72, IRQ 5
        I/O ports at b000
        Memory at fa000000 (64-bit, non-prefetchable)
        Memory at f9800000 (64-bit, non-prefetchable)
        Capabilities: [40] Power Management version 2

root@schiller:~ {195} $ cat /proc/interrupts
           CPU0       CPU1
  0:     177781          0          XT-PIC  timer
  1:       1153          0          XT-PIC  keyboard
  2:          0          0          XT-PIC  cascade
  4:         21          0          XT-PIC  serial
  8:          3          0          XT-PIC  rtc
 10:     740596          0          XT-PIC  aic7xxx
 11:     707791          0          XT-PIC  eth0
 13:          1          0          XT-PIC  fpu
NMI:          0
ERR:          0
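
The listing above shows every interrupt going through the legacy XT-PIC and being counted on CPU0 only, which is what one would expect after booting with "noapic". A minimal sketch of how such a listing could be checked with a script follows. It is not part of the original report: the function name `controllers_by_irq` and the `SAMPLE` constant are hypothetical, and it assumes the column layout shown above (IRQ label, one counter per CPU, controller name, device name).

```python
# Sample data copied from the /proc/interrupts listing in the report.
SAMPLE = """\
CPU0 CPU1
0: 177781 0 XT-PIC timer
1: 1153 0 XT-PIC keyboard
2: 0 0 XT-PIC cascade
4: 21 0 XT-PIC serial
8: 3 0 XT-PIC rtc
10: 740596 0 XT-PIC aic7xxx
11: 707791 0 XT-PIC eth0
13: 1 0 XT-PIC fpu
NMI: 0
ERR: 0
"""


def controllers_by_irq(text):
    """Map each numeric IRQ to the interrupt controller handling it."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the CPU header and non-numeric rows such as NMI: and ERR:.
        if not fields or not fields[0].rstrip(":").isdigit():
            continue
        irq = int(fields[0].rstrip(":"))
        # The per-CPU counters are the integer fields right after the label.
        rest = fields[1:]
        while rest and rest[0].isdigit():
            rest = rest[1:]
        # The first non-numeric field is the controller name.
        if rest:
            result[irq] = rest[0]
    return result


if __name__ == "__main__":
    for irq, controller in sorted(controllers_by_irq(SAMPLE).items()):
        print(irq, controller)
```

On a live system the same function could be fed the contents of /proc/interrupts instead of `SAMPLE`. Any controller name other than XT-PIC would suggest that the IO-APIC is in use.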

root@schiller:~ {196} $ lsmod
Module                  Size  Used by
nfsd                  183840  16 (autoclean)
nfs                    72800   1 (autoclean)
lockd                  47248   1 (autoclean) [nfsd nfs]
sunrpc                 66192   1 (autoclean) [nfsd nfs lockd]
3c59x                  21744   1 (autoclean)
softdog                 1584   1 (autoclean)
eeprom                  3072   0 (unused)
w83781d                17184   0 (unused)
i2c-piix4               3616   0 (unused)
i2c-proc                5984   0 [eeprom w83781d]
i2c-core               12640   0 [eeprom w83781d i2c-piix4 i2c-proc]
raid5                  19024   1 (autoclean)
unix                   12400  18 (autoclean)
aic7xxx               111696  11
sd_mod                 17600  11
scsi_mod               63680   2 [aic7xxx sd_mod]
ext2                   42320   7

Any ideas, anyone?
I would be glad to help fix this problem.

- andreas

--
Andreas Haumer | mailto:[email protected]
*x Software + Systeme | http://www.xss.co.at/
Karmarschgasse 51/2/20 | Tel: +43-1-6060114-0
A-1100 Vienna, Austria | Fax: +43-1-6060114-71


2002-01-12 22:34:57

by Benjamin LaHaise

Subject: Re: Linux-2.2.20 SMP & Asus CUR-DLS: "stuck on TLB IPI wait (CPU#3)"

On Sat, Jan 12, 2002 at 11:02:16PM +0100, Andreas Haumer wrote:
> Hi!
>
> I'm seeing a problem with SMP Linux-2.2.20 on an ASUS CUR-DLS
> motherboard. I noticed there were similar reports in the
> past few months and I got the impression the problem should
> already be fixed in 2.2.20, but seemingly it isn't.

This bug is fixed in 2.4.

-ben

2002-01-12 23:02:40

by Andreas Haumer

Subject: Re: Linux-2.2.20 SMP & Asus CUR-DLS: "stuck on TLB IPI wait (CPU#3)"

Benjamin LaHaise wrote:
>
> On Sat, Jan 12, 2002 at 11:02:16PM +0100, Andreas Haumer wrote:
> > Hi!
> >
> > I'm seeing a problem with SMP Linux-2.2.20 on an ASUS CUR-DLS
> > motherboard. I noticed there were similar reports in the
> > past few months and I got the impression the problem should
> > already be fixed in 2.2.20, but seemingly it isn't.
>
> This bug is fixed in 2.4.
>
Aha!

Anyone working on backporting it to 2.2.21?
Alan?

- andreas

--
Andreas Haumer | mailto:[email protected]
*x Software + Systeme | http://www.xss.co.at/
Karmarschgasse 51/2/20 | Tel: +43-1-6060114-0
A-1100 Vienna, Austria | Fax: +43-1-6060114-71

2002-01-12 23:14:32

by Benjamin LaHaise

Subject: Re: Linux-2.2.20 SMP & Asus CUR-DLS: "stuck on TLB IPI wait (CPU#3)"

On Sun, Jan 13, 2002 at 12:01:58AM +0100, Andreas Haumer wrote:
> Aha!
>
> Anyone working on backporting it to 2.2.21?
> Alan?

That's unlikely: the improvements in smp locking are what 2.4 was all about,
so "backporting" them is basically reinventing 2.4.

-ben

2002-01-13 01:28:13

by Alan

Subject: Re: Linux-2.2.20 SMP & Asus CUR-DLS: "stuck on TLB IPI wait (CPU#3)"

> Anyone working on backporting it to 2.2.21?
> Alan?

2.2 does not support VIA SMP; it's probably not a good kernel to choose for
the buggy VIA chipsets either.

2002-01-13 01:47:04

by Reid Hekman

Subject: Re: Linux-2.2.20 SMP & Asus CUR-DLS: "stuck on TLB IPI wait (CPU#3)"

On Sat, 2002-01-12 at 19:39, Alan Cox wrote:
> > Anyone working on backporting it to 2.2.21?
> > Alan?
>
> 2.2 does not support VIA SMP; it's probably not a good kernel to choose for
> the buggy VIA chipsets either.

So ServerWorks (re: his Asus CUR-DLS) is right out as well?

Reid

2002-01-13 01:50:13

by Alan

Subject: Re: Linux-2.2.20 SMP & Asus CUR-DLS: "stuck on TLB IPI wait (CPU#3)"

> > 2.2 does not support VIA SMP; it's probably not a good kernel to choose for
> > the buggy VIA chipsets either.
>
> So ServerWorks (re: his Asus CUR-DLS) is right out as well?

Serverworks I don't know. I've got reports of serverworks SMP working perfectly
well in the 2.2 tree so I don't know what the full story is there.

2002-01-13 11:46:15

by Andreas Haumer

Subject: Re: Linux-2.2.20 SMP & Asus CUR-DLS: "stuck on TLB IPI wait (CPU#3)"

Hi!

Alan Cox wrote:
>
> > > 2.2 does not support VIA SMP; it's probably not a good kernel to choose for
> > > the buggy VIA chipsets either.
> >
> > So ServerWorks (re: his Asus CUR-DLS) is right out as well?
>
> Serverworks I don't know. I've got reports of serverworks SMP working perfectly
> well in the 2.2 tree so I don't know what the full story is there.

This board worked fine for several months under 2.2.18.
I then upgraded to 2.2.20 yesterday and noticed this problem
for the first time (I didn't try 2.2.19 on it).

I still have the full 2.2.18 installation (I did the 2.2.20
installation on a separate SCA hard disk) so I can easily
switch. To see if it's a hardware problem I already switched
back to 2.2.18 once, and the problem went away.
Under 2.2.20 I have to boot with "noapic" to have it running
smoothly.

So if someone wants me to try or test something, just send
me a short note.

- andreas

--
Andreas Haumer | mailto:[email protected]
*x Software + Systeme | http://www.xss.co.at/
Karmarschgasse 51/2/20 | Tel: +43-1-6060114-0
A-1100 Vienna, Austria | Fax: +43-1-6060114-71

2002-01-13 15:28:36

by Alan

Subject: Re: Linux-2.2.20 SMP & Asus CUR-DLS: "stuck on TLB IPI wait (CPU#3)"

> switch. To see if it's a hardware problem I already switched
> back to 2.2.18 once, and the problem went away.
> Under 2.2.20 I have to boot with "noapic" to have it running
> smoothly.

Does 2.2.19 work ?

2002-01-13 15:55:59

by Andreas Haumer

Subject: Re: Linux-2.2.20 SMP & Asus CUR-DLS: "stuck on TLB IPI wait (CPU#3)"

Hi!

Alan Cox wrote:
>
> > switch. To see if it's a hardware problem I already switched
> > back to 2.2.18 once, and the problem went away.
> > Under 2.2.20 I have to boot with "noapic" to have it running
> > smoothly.
>
> Does 2.2.19 work ?

I was afraid you'd ask... ;-)

I skipped 2.2.19 and went from 2.2.18 straight to 2.2.20
(This is our fileserver, so it's not that easy to find a
time slot to reboot this machine...)

But I will try it today.

- andreas

--
Andreas Haumer | mailto:[email protected]
*x Software + Systeme | http://www.xss.co.at/
Karmarschgasse 51/2/20 | Tel: +43-1-6060114-0
A-1100 Vienna, Austria | Fax: +43-1-6060114-71

2002-01-13 16:27:28

by Andreas Haumer

Subject: Re: Linux-2.2.20 SMP & Asus CUR-DLS: "stuck on TLB IPI wait (CPU#3)"

00:00.0 Host bridge: Relience Computer CNB20HE (rev 05)
        Flags: bus master, medium devsel, latency 32

00:00.1 Host bridge: Relience Computer CNB20HE (rev 05)
        Flags: bus master, medium devsel, latency 48

00:05.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
        Subsystem: 3Com Corporation: Unknown device 9055
        Flags: bus master, medium devsel, latency 32, IRQ 18
        I/O ports at d800
        Memory at fe000000 (32-bit, non-prefetchable)
        Capabilities: [dc] Power Management version 1

00:07.0 VGA compatible controller: ATI Technologies Inc: Unknown device 4752 (rev 27) (prog-if 00 [VGA])
        Subsystem: Asustek Computer, Inc.: Unknown device 802b
        Flags: bus master, stepping, medium devsel, latency 32
        Memory at fd000000 (32-bit, non-prefetchable)
        I/O ports at f000
        Memory at fc800000 (32-bit, non-prefetchable)
        Expansion ROM at febc0000 [disabled]
        Capabilities: [5c] Power Management version 2

00:0f.0 ISA bridge: Relience Computer: Unknown device 0200 (rev 50)
        Subsystem: Relience Computer: Unknown device 0200
        Flags: bus master, medium devsel, latency 0

00:0f.1 IDE interface: Relience Computer: Unknown device 0211 (prog-if 8a [Master SecP PriP])
        Flags: bus master, medium devsel, latency 32
        I/O ports at d000

01:03.0 SCSI storage controller: Adaptec 7892A (rev 02)
        Subsystem: Adaptec: Unknown device e2a0
        Flags: bus master, 66Mhz, medium devsel, latency 32, IRQ 22
        BIST result: 00
        I/O ports at b800
        Memory at fb800000 (64-bit, non-prefetchable)
        Capabilities: [dc] Power Management version 2

01:05.0 SCSI storage controller: Symbios Logic Inc. (formerly NCR): Unknown device 0020 (rev 01)
        Flags: bus master, medium devsel, latency 72, IRQ 24
        I/O ports at b400
        Memory at fb000000 (64-bit, non-prefetchable)
        Memory at fa800000 (64-bit, non-prefetchable)
        Capabilities: [40] Power Management version 2

01:05.1 SCSI storage controller: Symbios Logic Inc. (formerly NCR): Unknown device 0020 (rev 01)
        Flags: bus master, medium devsel, latency 72, IRQ 25
        I/O ports at b000
        Memory at fa000000 (64-bit, non-prefetchable)
        Memory at f9800000 (64-bit, non-prefetchable)
        Capabilities: [40] Power Management version 2
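
The IRQ numbers in this listing differ from those in the "noapic" listing in the first message of the thread; the 3c905B, for example, moves from IRQ 11 to IRQ 18. The legacy 8259 (XT-PIC) pair only provides IRQs 0 to 15, so numbers above 15 indicate IO-APIC routing. The sketch below shows one way the two listings could be compared. It is an illustration only: the names `irqs_by_device`, `uses_ioapic_routing`, `NOAPIC_LISTING` and `APIC_LISTING` are hypothetical, and the sample strings keep just the device and Flags lines from the two listings.

```python
import re

# A device line starts with a bus:device.function address, e.g. "00:05.0".
DEVICE_RE = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) ")
IRQ_RE = re.compile(r"\bIRQ (\d+)")

# Excerpts from the two lspci -v listings posted in this thread.
NOAPIC_LISTING = """\
00:00.0 Host bridge: Relience Computer CNB20HE (rev 05)
        Flags: bus master, medium devsel, latency 32
00:05.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
        Flags: bus master, medium devsel, latency 32, IRQ 11
01:03.0 SCSI storage controller: Adaptec 7892A (rev 02)
        Flags: bus master, 66Mhz, medium devsel, latency 32, IRQ 10
"""

APIC_LISTING = """\
00:00.0 Host bridge: Relience Computer CNB20HE (rev 05)
        Flags: bus master, medium devsel, latency 32
00:05.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
        Flags: bus master, medium devsel, latency 32, IRQ 18
01:03.0 SCSI storage controller: Adaptec 7892A (rev 02)
        Flags: bus master, 66Mhz, medium devsel, latency 32, IRQ 22
"""


def irqs_by_device(lspci_text):
    """Return {pci_address: irq} for every device that reports an IRQ."""
    irqs = {}
    current = None
    for line in lspci_text.splitlines():
        stripped = line.strip()
        device = DEVICE_RE.match(stripped)
        if device:
            current = device.group(1)
            continue
        if current and stripped.startswith("Flags:"):
            irq = IRQ_RE.search(stripped)
            if irq:
                irqs[current] = int(irq.group(1))
    return irqs


def uses_ioapic_routing(irqs):
    """IRQ numbers above 15 cannot come from the legacy 8259 pair."""
    return any(irq > 15 for irq in irqs.values())


if __name__ == "__main__":
    before = irqs_by_device(NOAPIC_LISTING)
    after = irqs_by_device(APIC_LISTING)
    for address in sorted(before):
        print(address, before[address], "->", after.get(address))
```

Devices that report no IRQ, such as the host bridge, are left out of the result.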


Attachments:
schiller-2.2.19.dmesg (13.15 kB)
schiller-2.2.19.lspci (2.11 kB)