2001-10-28 22:30:00

by Raphael Manfredi

Subject: 8139too on ABIT BP6 causes "eth0: transmit timed out"

I'm running:

Linux nice 2.4.12-ac3 #1 SMP Sat Oct 20 16:21:24 MEST 2001 i686 unknown

but this problem is not specific to that kernel. I've been having
it for a looong time.

Specifically, I get:

NETDEV WATCHDOG: eth0: transmit timed out
eth0: Tx queue start entry 32190531 dirty entry 32190527.
eth0: Tx descriptor 0 is 00002000.
eth0: Tx descriptor 1 is 00002000.
eth0: Tx descriptor 2 is 00002000.
eth0: Tx descriptor 3 is 00002000. (queue head)
eth0: Setting 100mbps full-duplex based on auto-negotiated partner ability 45e1.

and then the machine is dead, network-wise. I have to reboot (reset).
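
For anyone who wants to count how often this happens, here is a minimal Python sketch that scans kernel log text for the watchdog message and collects the Tx descriptor values. The regular expressions are modelled on the lines quoted above; the name `summarize_tx_timeouts` is made up for illustration and is not part of the driver or any existing tool.

```python
import re

# Patterns modelled on the log lines quoted above (an assumption about their
# exact format, not taken from the driver source).
TIMEOUT_RE = re.compile(r"NETDEV WATCHDOG: (\w+): transmit timed out")
DESC_RE = re.compile(r"(\w+): Tx descriptor (\d+) is ([0-9a-fA-F]{8})\.")


def summarize_tx_timeouts(log_text):
    """Return {interface: {"timeouts": n, "descriptors": {index: value}}}."""
    summary = {}
    for line in log_text.splitlines():
        match = TIMEOUT_RE.search(line)
        if match:
            entry = summary.setdefault(
                match.group(1), {"timeouts": 0, "descriptors": {}})
            entry["timeouts"] += 1
            continue
        match = DESC_RE.search(line)
        if match:
            entry = summary.setdefault(
                match.group(1), {"timeouts": 0, "descriptors": {}})
            # Keep the most recent value seen for each descriptor index.
            entry["descriptors"][int(match.group(2))] = int(match.group(3), 16)
    return summary


# The log excerpt from this message, used as sample input.
SAMPLE = """\
NETDEV WATCHDOG: eth0: transmit timed out
eth0: Tx queue start entry 32190531 dirty entry 32190527.
eth0: Tx descriptor 0 is 00002000.
eth0: Tx descriptor 1 is 00002000.
eth0: Tx descriptor 2 is 00002000.
eth0: Tx descriptor 3 is 00002000. (queue head)
eth0: Setting 100mbps full-duplex based on auto-negotiated partner ability 45e1.
"""

if __name__ == "__main__":
    print(summarize_tx_timeouts(SAMPLE))
```

Feeding it the saved output of `dmesg` instead of `SAMPLE` would give a per-interface count of timeouts.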

Note that I am on an ABIT BP6 board, and I do get a lot of APIC errors
under heavy network traffic, which is what raises the above.
By heavy network traffic, I mean a 7 Mb/s full duplex (it's a 100 Mb/s
LAN).

I suspect that somewhere, the APIC gets hosed and loses an interrupt,
and then the ethernet driver no longer processes its queue.
Is there anything that can be done to reset the hardware state more
fully when this occurs?
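
One way to check the lost-interrupt suspicion might be to compare two snapshots of /proc/interrupts taken while traffic is pending. The Python sketch below is only an illustration: the function names and the sample snapshots are made up, and a counter that stops advancing would be consistent with a lost interrupt without proving it.

```python
def irq_count(interrupts_text, device):
    """Return the summed per-CPU count for the row listing `device`, or None."""
    for line in interrupts_text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # header row ("CPU0 CPU1") has no IRQ label
        fields = rest.split()
        counts = []
        for field in fields:
            if not field.isdigit():
                break
            counts.append(int(field))
        # Whatever follows the counters is the controller type and device names.
        names = " ".join(fields[len(counts):]).replace(",", " ").split()
        if device in names:
            return sum(counts)
    return None


def looks_stalled(before_text, after_text, device="eth0"):
    """True if the device's interrupt count did not advance between snapshots."""
    before = irq_count(before_text, device)
    after = irq_count(after_text, device)
    return before is not None and before == after


# Hypothetical snapshots (made-up numbers, two CPUs as on a BP6).
BEFORE = (
    "           CPU0       CPU1\n"
    " 19:     500000     498000   IO-APIC-level  eth0\n"
)
AFTER_OK = (
    "           CPU0       CPU1\n"
    " 19:     500100     498050   IO-APIC-level  eth0\n"
)

if __name__ == "__main__":
    print(looks_stalled(BEFORE, AFTER_OK))
    print(looks_stalled(BEFORE, BEFORE))
```

In real use the two texts would come from reading /proc/interrupts a few seconds apart.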

Raphael


2001-10-28 23:11:17

by Erich Boleyn

Subject: APM disable broken (was -> Re: 8139too on ABIT BP6 causes "eth0: transmit timed out" )


Raphael Manfredi <[email protected]> wrote:

...[recent 2.4-based kernel]...

> but this problem is not specific to that kernel. I've been having
> it for a looong time.
>
> Specifically, I get:
>
> NETDEV WATCHDOG: eth0: transmit timed out
...
> and then the machine is dead, network-wise. I have to reboot (reset).
>
> Note that I am on an ABIT BP6 board, and I do get a lot of APIC errors
> under heavy network traffic, which is what raises the above.
> By heavy network traffic, I mean a 7 Mb/s full duplex (it's a 100 Mb/s
> LAN).

I had what looks like exactly this problem with my ABIT BP6-based machine
running RH 7.1, and the problem turned out to be the interaction between
SMP and the APM BIOS, when APM is turned on. A different network card,
but the same symptom. Another symptom I would occasionally see was a
certain kind of hard-disk hang, but only on the integrated HPT366
controller.

I suggest you try either:

-- adding the "noapic" option to your kernel command line (which will
lose you some I/O performance since normal interrupts will not be
handled APIC-style)
-- completely disabling APM from your kernel configuration. Using
"apm=off/disabled" (I can't remember the exact one you're supposed
to use here) does not totally disable APM usage.
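
To confirm which of these options the running kernel was actually booted with, a small Python sketch along these lines could inspect the command line. The function name `workaround_flags` and the sample command lines are hypothetical; the sketch only reports whatever `apm=` value was passed and does not claim which spelling the kernel accepts.

```python
def workaround_flags(cmdline):
    """Report whether "noapic" and an "apm=" value appear on a command line."""
    tokens = cmdline.split()
    apm_values = [t.split("=", 1)[1] for t in tokens if t.startswith("apm=")]
    return {
        "noapic": "noapic" in tokens,
        # If the option is repeated, report the last value given.
        "apm": apm_values[-1] if apm_values else None,
    }


if __name__ == "__main__":
    # /proc/cmdline holds the command line the running kernel was booted with.
    with open("/proc/cmdline") as handle:
        print(workaround_flags(handle.read()))
```

This only shows what was passed at boot. As noted above, an `apm=` option on its own does not totally disable APM usage.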


This brings me to my other point. During the Linux kernel startup
code (in the early assembly), the APM BIOS checking code leaves the
BIOS in the "connected" state even if the kernel option for disabling
APM or the SMP forced disable of APM is triggered.

This makes various motherboards (such as the ABIT BP6) unstable.

The Right Thing to do would be to disconnect the APM BIOS if it is
determined that APM support should be disabled.

I could probably generate a patch to fix this if it looked like it would
be accepted by the folks maintaining APM support...

--
Erich Stefan Boleyn <[email protected]> http://www.uruk.org/
"Reality is truly stranger than fiction; Probably why fiction is so popular"

2001-10-29 00:13:06

by victor

Subject: Re: APM disable broken (was -> Re: 8139too on ABIT BP6 causes "eth0: transmit timed out" )

Hello Erich,

Monday, October 29, 2001, 12:11:27 AM, you wrote:

I have a dual Celeron in a BP6. I reflashed the BIOS with the
"Final RU BIOS" revision (the newest Final Release BIOS from Abit),
which I got from http://bp6.gamesquad.net/bios.phtml.
I also have an OvisLink card with the 8139C chip and an HP 100 Mb/s
switch, and everything works fine.


euo> Raphael Manfredi <[email protected]> wrote:

euo> ...[recent 2.4-based kernel]...

>> but this problem is not specific to that kernel. I've been having
>> it for a looong time.
>>
>> Specifically, I get:
>>
>> NETDEV WATCHDOG: eth0: transmit timed out
euo> ...
>> and then the machine is dead, network-wise. I have to reboot (reset).
>>
>> Note that I am on an ABIT BP6 board, and I do get a lot of APIC errors
>> under heavy network traffic, which is what raises the above.
>> By heavy network traffic, I mean a 7 Mb/s full duplex (it's a 100 Mb/s
>> LAN).

euo> I had what looks like exactly this problem with my ABIT BP6-based machine
euo> running RH 7.1, and the problem turned out to be the interaction between
euo> SMP and the APM BIOS, when APM is turned on. A different network card,
euo> but the same symptom. Another symptom I would occasionally see was a
euo> certain kind of hard-disk hang, but only on the integrated HPT366
euo> controller.

euo> I suggest you try either:

euo> -- adding the "noapic" option to your kernel command line (which will
euo> lose you some I/O performance since normal interrupts will not be
euo> handled APIC-style)
euo> -- completely disabling APM from your kernel configuration. Using
euo> "apm=off/disabled" (I can't remember the exact one you're supposed
euo> to use here) does not totally disable APM usage.


euo> This brings me to my other point. During the Linux kernel startup
euo> code (in the early assembly), the APM BIOS checking code leaves the
euo> BIOS in the "connected" state even if the kernel option for disabling
euo> APM or the SMP forced disable of APM is triggered.

euo> This makes various motherboards (such as the ABIT BP6) unstable.

euo> The Right Thing to do would be to disconnect the APM BIOS if it is
euo> determined that APM support should be disabled.

euo> I could probably generate a patch to fix this if it looked like it would
euo> be accepted by the folks maintaining APM support...

euo> --
euo> Erich Stefan Boleyn <[email protected]> http://www.uruk.org/
euo> "Reality is truly stranger than fiction; Probably why fiction is so popular"



--
Best regards,
victor mailto:[email protected]

2001-10-29 02:43:51

by Oden Eriksson

Subject: Re: APM disable broken (was -> Re: 8139too on ABIT BP6 causes "eth0: transmit timed out" )

On Monday 29 October 2001 00.11, [email protected] wrote:
> Raphael Manfredi <[email protected]> wrote:
>
> ...[recent 2.4-based kernel]...
>
> > but this problem is not specific to that kernel. I've been having
> > it for a looong time.
> >
> > Specifically, I get:
> >
> > NETDEV WATCHDOG: eth0: transmit timed out
>
> ...
>
> > and then the machine is dead, network-wise. I have to reboot (reset).
> >
> > Note that I am on an ABIT BP6 board, and I do get a lot of APIC errors
> > under heavy network traffic, which is what raises the above.
> > By heavy network traffic, I mean a 7 Mb/s full duplex (it's a 100 Mb/s
> > LAN).
>
> I had what looks like exactly this problem with my ABIT BP6-based machine
> running RH 7.1, and the problem turned out to be the interaction between
> SMP and the APM BIOS, when APM is turned on. A different network card,
> but the same symptom. Another symptom I would occasionally see was a
> certain kind of hard-disk hang, but only on the integrated HPT366
> controller.

You might want to try the latest Abit BP6 "RU" BIOS with HPT366 BIOS v1.28.
Get it at:

ftp://ftp.mathematik.uni-marburg.de/pub/mirror/abit/beta/bp6/bios/128/bp6ru128.zip

Cheers.

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
| Oden Eriksson, Deserve-IT Networks, Jokkmokk, Sweden.
| Mandrake Linux release 8.2 (Cooker) for i586
| Current uptime with kernel 2.4.12-6mdksmp: 14min
| cpu0 @ 814.28 bm, fan 4500 rpm, temp +31.0°C
| cpu1 @ 815.92 bm, fan 4500 rpm, temp +29°C

2001-10-29 05:43:47

by Daniel R. Warner

Subject: Re: 8139too on ABIT BP6 causes "eth0: transmit timed out"

Raphael Manfredi wrote:

> Specifically, I get:
>
> NETDEV WATCHDOG: eth0: transmit timed out
> eth0: Tx queue start entry 32190531 dirty entry 32190527.
> eth0: Tx descriptor 0 is 00002000.
> eth0: Tx descriptor 1 is 00002000.
> eth0: Tx descriptor 2 is 00002000.
> eth0: Tx descriptor 3 is 00002000. (queue head)
> eth0: Setting 100mbps full-duplex based on auto-negotiated partner ability 45e1.
>
> and then the machine is dead, network-wise. I have to reboot (reset).

I see the same problem when I compile the driver in PIO mode, except that
it happens as soon as the card is initialized. The only way for me to
use PIO is to put the Becker driver in its place.

-D


2001-10-29 13:43:02

by Raphael Manfredi

Subject: Re: 8139too on ABIT BP6 causes "eth0: transmit timed out"

Quoting Martin Eriksson <[email protected]> from ml.linux.kernel:
:What Bios are you running? You should be running the modified RU1.25 Bios,
:and have enabled MPS 1.4. Also disable "Spread Spectrum" and do *not*
:overclock.

I have the RU BIOS, I enabled MPS 1.4.
What is "Spread Spectrum"?
I don't overclock. I never do.

I remember seeing messages about network lockups on the ABIT BP6, and
there was a patch submitted to the list. It was a SysRq patch
that would "do something to the APIC" and re-enable network interrupts.

However, I don't remember who posted it or where the patch is.

Is the older Becker driver for the RTL8139 more robust? I know it's no longer
maintained, but if I want to try it, how do I proceed? I don't remember
seeing it offered in the configuration menu.

Raphael