2004-06-19 13:05:35

by Ian Kumlien

[permalink] [raw]
Subject: [sundance] Known problems?

Hi,

I changed my networking card back to my dlink DFE-580tx (one of those 4
port 100mbit cards that uses the alta chipset). And now the watchdog
keep killing the connection.

The max bw that is being used is aprox 5.5 megabyte/s (to avoid
confusion). Has there been any work done with this recently or during
2.5/6 development? Afair i had no problems with it in 2.4.x when my
firewall used it.

CC, I'm not subscribed.

Here is some info, i assume you need more...

On the server side:

NETDEV WATCHDOG: eth0: transmit timed out
eth0: Transmit timed out, TxStatus 00 TxFrameId 18, resetting...
00 2fcbe000 2fcbe010 00008001(00) 1d883002 800005ea
01 2fcbe010 2fcbe020 00008005(01) 180fb802 800005ea
02 2fcbe020 2fcbe030 00008009(02) 08630802 800005ea
03 2fcbe030 2fcbe040 0000800d(03) 2a4e7002 800005ea
04 2fcbe040 2fcbe050 00008011(04) 1897b002 800005ea
05 2fcbe050 2fcbe060 00008015(05) 2c51b002 800005ea
06 2fcbe060 2fcbe070 00008019(06) 0be05802 800005ea
07 2fcbe070 2fcbe080 0000801d(07) 0e7d2002 800005ea
08 2fcbe080 2fcbe090 00008021(08) 0e7d2802 800005ea
09 2fcbe090 2fcbe0a0 00008025(09) 2b02e402 8000017a
0a 2fcbe0a0 2fcbe0b0 00008029(0a) 25120802 800005ea
0b 2fcbe0b0 2fcbe0c0 0000802d(0b) 19ac5802 800005ea
0c 2fcbe0c0 2fcbe0d0 00008031(0c) 19ac5002 800005ea
0d 2fcbe0d0 2fcbe0e0 00008035(0d) 2bf58802 800005ea
0e 2fcbe0e0 2fcbe0f0 00008039(0e) 2bf58002 800005ea
0f 2fcbe0f0 2fcbe100 0000803d(0f) 145b6802 800005ea
10 2fcbe100 2fcbe110 00008041(10) 145b6002 800005ea
11 2fcbe110 2fcbe120 00008045(11) 152db802 800005ea
12 2fcbe120 2fcbe130 00008049(12) 152db002 800005ea
13 2fcbe130 2fcbe140 0000804d(13) 2d0de802 800005ea
14 2fcbe140 2fcbe150 00008051(14) 2d0de002 800005ea
15 2fcbe150 2fcbe160 00008055(15) 2c5d3802 800005ea
16 2fcbe160 00000000 00008059(16) 2c5d3002 800005ea
17 2fcbe170 2fcbe180 0001805d(17) 00000000 00000000
18 2fcbe180 2fcbe190 00018061(18) 00000000 00000000
19 2fcbe190 2fcbe1a0 00008065(19) 2d4d2802 800005ea
1a 2fcbe1a0 2fcbe1b0 00008069(1a) 17b26002 800005ea
1b 2fcbe1b0 2fcbe1c0 0000806d(1b) 2df94802 800005ea
1c 2fcbe1c0 2fcbe1d0 00008071(1c) 2bd26802 800005ea
1d 2fcbe1d0 2fcbe1e0 00008075(1d) 2bd26002 800005ea
1e 2fcbe1e0 2fcbe1f0 00008079(1e) 2d5c0802 800005ea
1f 2fcbe1f0 2fcbe000 0000807d(1f) 2d759802 800005ea
TxListPtr=2fcbe180 netif_queue_stopped=1
cur_tx=28023(17) dirty_tx=27993(19)
cur_rx=40 dirty_rx=40
cur_task=28023

On the client side:
nfs: server blue not responding, still trying

--
Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net


Attachments:
signature.asc (189.00 B)
This is a digitally signed message part

2004-06-19 13:22:50

by Andre Tomt

[permalink] [raw]
Subject: Re: [sundance] Known problems?

Ian Kumlien wrote:

> Hi,
>
> I changed my networking card back to my dlink DFE-580tx (one of those 4
> port 100mbit cards that uses the alta chipset). And now the watchdog
> keep killing the connection.
>
> The max bw that is being used is aprox 5.5 megabyte/s (to avoid
> confusion). Has there been any work done with this recently or during
> 2.5/6 development? Afair i had no problems with it in 2.4.x when my
> firewall used it.
>
> CC, I'm not subscribed.

FYI;

Other than beeing a slow card with mmio-bugs, the only problems I have
had with that card was when having a kernel patched with the now defunct
and buggy IMQ. Problems were identical.

--
Cheers,
Andr? Tomt

2004-06-19 13:31:48

by Ian Kumlien

[permalink] [raw]
Subject: Re: [sundance] Known problems?

On Sat, 2004-06-19 at 15:22, Andre Tomt wrote:
> Ian Kumlien wrote:
>
> > Hi,
> >
> > I changed my networking card back to my dlink DFE-580tx (one of those 4
> > port 100mbit cards that uses the alta chipset). And now the watchdog
> > keep killing the connection.
> >
> > The max bw that is being used is aprox 5.5 megabyte/s (to avoid
> > confusion). Has there been any work done with this recently or during
> > 2.5/6 development? Afair i had no problems with it in 2.4.x when my
> > firewall used it.
> >
> > CC, I'm not subscribed.
>
> FYI;
>
> Other than beeing a slow card with mmio-bugs, the only problems I have
> had with that card was when having a kernel patched with the now defunct
> and buggy IMQ. Problems were identical.

Yeah i know about the MMIO bit, but i never had this problem before...
Even when loading it with full 100mbit bw (but that was on 2.4).

Can't it be to paranoid watchdog timings?
(Btw, what is IMO, I'd think it meant 'in my opinion' but, heh =))

--
Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net


Attachments:
signature.asc (189.00 B)
This is a digitally signed message part

2004-06-19 15:56:18

by Andre Tomt

[permalink] [raw]
Subject: Re: [sundance] Known problems?

Ian Kumlien wrote:
> On Sat, 2004-06-19 at 15:22, Andre Tomt wrote:
>>FYI;
>>
>>Other than beeing a slow card with mmio-bugs, the only problems I have
>>had with that card was when having a kernel patched with the now defunct
>>and buggy IMQ. Problems were identical.
>
>
> Yeah i know about the MMIO bit, but i never had this problem before...
> Even when loading it with full 100mbit bw (but that was on 2.4).

I have not used the card since around 2.6.4, but it worked fine back
then. Did some performance testing on it, with high data and packets/s
rates, so it did get a fair amount of beating.

> Can't it be to paranoid watchdog timings?
> (Btw, what is IMO, I'd think it meant 'in my opinion' but, heh =))

Not IMO, IMQ, Q as in q ;-)

If you don't know what it is, you most likely aren't using it. I'm not
avare of any distributions having it applied. It's used for combinding
several network packet queues into one for example.

2004-06-19 16:08:56

by Ian Kumlien

[permalink] [raw]
Subject: Re: [sundance] Known problems?

On Sat, 2004-06-19 at 17:56, Andre Tomt wrote:
> Ian Kumlien wrote:
> > On Sat, 2004-06-19 at 15:22, Andre Tomt wrote:
> >>FYI;
> >>
> >>Other than beeing a slow card with mmio-bugs, the only problems I have
> >>had with that card was when having a kernel patched with the now defunct
> >>and buggy IMQ. Problems were identical.
> >
> > Yeah i know about the MMIO bit, but i never had this problem before...
> > Even when loading it with full 100mbit bw (but that was on 2.4).
>
> I have not used the card since around 2.6.4, but it worked fine back
> then. Did some performance testing on it, with high data and packets/s
> rates, so it did get a fair amount of beating.

Yeah, I noticed when testing that it is when you make use of fullduplex
that the driver goes all ape. Ie, the report i sent was about 1mb/s up
and the rest down of the total 5.5 mb/s.

> > Can't it be to paranoid watchdog timings?
> > (Btw, what is IMO, I'd think it meant 'in my opinion' but, heh =))
>
> Not IMO, IMQ, Q as in q ;-)

Ack, =)

> If you don't know what it is, you most likely aren't using it. I'm not
> avare of any distributions having it applied. It's used for combinding
> several network packet queues into one for example.

Which might help the case that i saw... Is it avail from somewhere?

My main problem is that enough of these watchdog thingies and i have
reboot the machine to reinit the hw. (ie, hw is constantly down no
matter what you do... )

--
Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net


Attachments:
signature.asc (189.00 B)
This is a digitally signed message part

2004-06-19 17:02:56

by Ian Kumlien

[permalink] [raw]
Subject: Re: [sundance] Known problems?

On Sat, 2004-06-19 at 17:56, Andre Tomt wrote:
> If you don't know what it is, you most likely aren't using it. I'm not
> avare of any distributions having it applied. It's used for combinding
> several network packet queues into one for example.

RX packets:26633293 errors:0 dropped:672 overruns:0 frame:0
TX packets:10031287 errors:132 dropped:3960 overruns:0
carrier:0

To my knowledge all those drops was yesterday during a 2 hour period..
This is a local switched lan with flowcontrol enabled.
Couldn't this be tx getting nuts due to delays that rx packets causes?

Thus:
NETDEV WATCHDOG: eth0: transmit timed out
eth0: Transmit timed out, TxStatus 00 TxFrameId 14, resetting...

Anyways, something is wrong... but i assume it could be pci related:
5: 43355556 XT-PIC eth0
10: 19987656 XT-PIC aic7xxx, eth3
---
LOC: 143364244
ERR: 406517
---

Any clues?

--
Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net


Attachments:
signature.asc (189.00 B)
This is a digitally signed message part