2007-02-05 05:13:12

by Robert Hancock

Subject: forcedeth problems on 2.6.20-rc6-mm3

Something's busted with forcedeth in 2.6.20-rc6-mm3 for me relative to
2.6.20-rc6. There's no errors in dmesg, but it seems no packets ever get
received and so the machine can't get an IP address. I tried reverting
all the -mm changes to drivers/net/forcedeth.c, which didn't help. The
network controller shares an IRQ with the USB OHCI controller which is
receiving interrupts, so it doesn't seem like an interrupt routing
problem, though I suppose something weird could be happening there.

This is on an Asus A8N-SLI Deluxe (CK804 chipset) on x86_64.

Any suggestions on how to debug/what to try reverting to see what's
causing this?

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [email protected]
Home Page: http://www.roberthancock.com/


2007-02-05 05:36:15

by Andrew Morton

Subject: Re: forcedeth problems on 2.6.20-rc6-mm3

On Sun, 04 Feb 2007 23:13:09 -0600 Robert Hancock <[email protected]> wrote:

> Something's busted with forcedeth in 2.6.20-rc6-mm3 for me relative to
> 2.6.20-rc6. There's no errors in dmesg, but it seems no packets ever get
> received and so the machine can't get an IP address. I tried reverting
> all the -mm changes to drivers/net/forcedeth.c, which didn't help. The
> network controller shares an IRQ with the USB OHCI controller which is
> receiving interrupts, so it doesn't seem like an interrupt routing
> problem, though I suppose something weird could be happening there.
>
> This is on an Asus A8N-SLI Deluxe (CK804 chipset) on x86_64.
>
> Any suggestions on how to debug/what to try reverting to see what's
> causing this?

There are many forcedeth changes in git-netdev-all.patch. Can you
try reverting drivers/net/forcedeth.c back to the unpatched version
from 2.6.20-rc6?

Thanks.

2007-02-05 05:49:32

by Robert Hancock

Subject: Re: forcedeth problems on 2.6.20-rc6-mm3

Andrew Morton wrote:
> On Sun, 04 Feb 2007 23:13:09 -0600 Robert Hancock <[email protected]> wrote:
>
>> Something's busted with forcedeth in 2.6.20-rc6-mm3 for me relative to
>> 2.6.20-rc6. There's no errors in dmesg, but it seems no packets ever get
>> received and so the machine can't get an IP address. I tried reverting
>> all the -mm changes to drivers/net/forcedeth.c, which didn't help. The
>> network controller shares an IRQ with the USB OHCI controller which is
>> receiving interrupts, so it doesn't seem like an interrupt routing
>> problem, though I suppose something weird could be happening there.
>>
>> This is on an Asus A8N-SLI Deluxe (CK804 chipset) on x86_64.
>>
>> Any suggestions on how to debug/what to try reverting to see what's
>> causing this?
>
> There are many forcedeth changes in git-netdev-all.patch. Can you
> try reverting drivers/net/forcedeth.c back to the unpatched version
> from 2.6.20-rc6?
>
> Thanks.
>

That's essentially what I did, it didn't appear to help. I assume the
problem must lie elsewhere..

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [email protected]
Home Page: http://www.roberthancock.com/

2007-02-05 06:17:45

by Andrew Morton

Subject: Re: forcedeth problems on 2.6.20-rc6-mm3

On Sun, 04 Feb 2007 23:48:33 -0600 Robert Hancock <[email protected]> wrote:

> Andrew Morton wrote:
> > On Sun, 04 Feb 2007 23:13:09 -0600 Robert Hancock <[email protected]> wrote:
> >
> >> Something's busted with forcedeth in 2.6.20-rc6-mm3 for me relative to
> >> 2.6.20-rc6. There's no errors in dmesg, but it seems no packets ever get
> >> received and so the machine can't get an IP address. I tried reverting
> >> all the -mm changes to drivers/net/forcedeth.c, which didn't help. The
> >> network controller shares an IRQ with the USB OHCI controller which is
> >> receiving interrupts, so it doesn't seem like an interrupt routing
> >> problem, though I suppose something weird could be happening there.
> >>
> >> This is on an Asus A8N-SLI Deluxe (CK804 chipset) on x86_64.
> >>
> >> Any suggestions on how to debug/what to try reverting to see what's
> >> causing this?
> >
> > There are many forcedeth changes in git-netdev-all.patch. Can you
> > try reverting drivers/net/forcedeth.c back to the unpatched version
> > from 2.6.20-rc6?
> >
> > Thanks.
> >
>
> That's essentially what I did, it didn't appear to help. I assume the
> problem must lie elsewhere..
>

doh, I missed that.

It's presumably not the driver and nobody else seems to be hitting this, so
it must be something peculiar to your setup. But I don't know what it
might be, sorry.

2007-02-05 06:34:56

by Daniel Barkalow

Subject: Re: forcedeth problems on 2.6.20-rc6-mm3

On Sun, 4 Feb 2007, Robert Hancock wrote:

> Something's busted with forcedeth in 2.6.20-rc6-mm3 for me relative to
> 2.6.20-rc6. There's no errors in dmesg, but it seems no packets ever get
> received and so the machine can't get an IP address. I tried reverting all the
> -mm changes to drivers/net/forcedeth.c, which didn't help. The network
> controller shares an IRQ with the USB OHCI controller which is receiving
> interrupts, so it doesn't seem like an interrupt routing problem, though I
> suppose something weird could be happening there.

IIRC, forcedeth tries to use MSI by default. Perhaps the hardware is using
it, but the kernel thinks enabling it didn't work? I think there's a
module option for forcedeth to disable MSI, which might be worth a try to
see if it has any effect.

-Daniel
*This .sig left intentionally blank*

2007-02-06 00:36:05

by Robert Hancock

Subject: Re: forcedeth problems on 2.6.20-rc6-mm3

Daniel Barkalow wrote:
> On Sun, 4 Feb 2007, Robert Hancock wrote:
>
>> Something's busted with forcedeth in 2.6.20-rc6-mm3 for me relative to
>> 2.6.20-rc6. There's no errors in dmesg, but it seems no packets ever get
>> received and so the machine can't get an IP address. I tried reverting all the
>> -mm changes to drivers/net/forcedeth.c, which didn't help. The network
>> controller shares an IRQ with the USB OHCI controller which is receiving
>> interrupts, so it doesn't seem like an interrupt routing problem, though I
>> suppose something weird could be happening there.
>
> IIRC, forcedeth tries to use MSI by default. Perhaps the hardware is using
> it, but the kernel thinks enabling it didn't work? I think there's a
> module option for forcedeth to disable MSI, which might be worth a try to
> see if it has any effect.

I must have messed something up when testing before - reverting to
forcedeth.c from 2.6.20-rc6 does indeed fix the problem. And it doesn't
seem like no packets at all are received with the -mm3 version (driver
version 0.60), either - if I do a tcpdump I can get Ethernet packets
showing up, but I can't ping my router so it seems like something isn't
getting through properly. With the 2.6.20-rc6 version (driver version
0.59) it works fine. I switched back and forth between versions and this
seems repeatable.

I don't think it's MSI related, the CK804 version of these controllers
doesn't support MSI and the driver shouldn't be trying to use it. I
tried the MSI and MSI-X disable options on the 0.60 driver, but that
didn't help.

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [email protected]
Home Page: http://www.roberthancock.com/

2007-02-06 00:52:36

by Andrew Morton

Subject: Re: forcedeth problems on 2.6.20-rc6-mm3

On Mon, 05 Feb 2007 18:35:06 -0600
Robert Hancock <[email protected]> wrote:

> Daniel Barkalow wrote:
> > On Sun, 4 Feb 2007, Robert Hancock wrote:
> >
> >> Something's busted with forcedeth in 2.6.20-rc6-mm3 for me relative to
> >> 2.6.20-rc6. There's no errors in dmesg, but it seems no packets ever get
> >> received and so the machine can't get an IP address. I tried reverting all the
> >> -mm changes to drivers/net/forcedeth.c, which didn't help. The network
> >> controller shares an IRQ with the USB OHCI controller which is receiving
> >> interrupts, so it doesn't seem like an interrupt routing problem, though I
> >> suppose something weird could be happening there.
> >
> > IIRC, forcedeth tries to use MSI by default. Perhaps the hardware is using
> > it, but the kernel thinks enabling it didn't work? I think there's a
> > module option for forcedeth to disable MSI, which might be worth a try to
> > see if it has any effect.
>
> I must have messed something up when testing before - reverting to
> forcedeth.c from 2.6.20-rc6 does indeed fix the problem. And it doesn't
> seem like no packets at all are received with the -mm3 version (driver
> version 0.60), either - if I do a tcpdump I can get Ethernet packets
> showing up, but I can't ping my router so it seems like something isn't
> getting through properly. With the 2.6.20-rc6 version (driver version
> 0.59) it works fine. I switched back and forth between versions and this
> seems repeatable.
>
> I don't think it's MSI related, the CK804 version of these controllers
> doesn't support MSI and the driver shouldn't be trying to use it. I
> tried the MSI and MSI-X disable options on the 0.60 driver, but that
> didn't help.
>

OK, thanks. Jeff, please note that the forcedeth changes in git-netdev-all
have a problem.

2007-02-08 05:33:56

by Andrew Morton

Subject: Re: forcedeth problems on 2.6.20-rc6-mm3

On Mon, 5 Feb 2007 16:52:24 -0800 Andrew Morton <[email protected]> wrote:

> On Mon, 05 Feb 2007 18:35:06 -0600
> Robert Hancock <[email protected]> wrote:
>
> > Daniel Barkalow wrote:
> > > On Sun, 4 Feb 2007, Robert Hancock wrote:
> > >
> > >> Something's busted with forcedeth in 2.6.20-rc6-mm3 for me relative to
> > >> 2.6.20-rc6. There's no errors in dmesg, but it seems no packets ever get
> > >> received and so the machine can't get an IP address. I tried reverting all the
> > >> -mm changes to drivers/net/forcedeth.c, which didn't help. The network
> > >> controller shares an IRQ with the USB OHCI controller which is receiving
> > >> interrupts, so it doesn't seem like an interrupt routing problem, though I
> > >> suppose something weird could be happening there.
> > >
> > > IIRC, forcedeth tries to use MSI by default. Perhaps the hardware is using
> > > it, but the kernel thinks enabling it didn't work? I think there's a
> > > module option for forcedeth to disable MSI, which might be worth a try to
> > > see if it has any effect.
> >
> > I must have messed something up when testing before - reverting to
> > forcedeth.c from 2.6.20-rc6 does indeed fix the problem. And it doesn't
> > seem like no packets at all are received with the -mm3 version (driver
> > version 0.60), either - if I do a tcpdump I can get Ethernet packets
> > showing up, but I can't ping my router so it seems like something isn't
> > getting through properly. With the 2.6.20-rc6 version (driver version
> > 0.59) it works fine. I switched back and forth between versions and this
> > seems repeatable.
> >
> > I don't think it's MSI related, the CK804 version of these controllers
> > doesn't support MSI and the driver shouldn't be trying to use it. I
> > tried the MSI and MSI-X disable options on the 0.60 driver, but that
> > didn't help.
> >
>
> OK, thanks. Jeff, please note that the forcedeth changes in git-netdev-all
> have a problem.

Well, all the forcedeth patches seem to have wandered into mainline anyway.

Please test current git (or tomorrow's git snapshot), see if it works?

2007-02-09 04:56:28

by Ayaz Abdulla

Subject: Re: forcedeth problems on 2.6.20-rc6-mm3

For all those who are having issues, please try out the attached patch.

Ayaz

--- orig/drivers/net/forcedeth.c 2007-02-08 21:41:59.000000000 -0500
+++ new/drivers/net/forcedeth.c 2007-02-08 21:44:53.000000000 -0500
@@ -3104,13 +3104,17 @@
 	struct fe_priv *np = netdev_priv(dev);
 	u8 __iomem *base = get_hwbase(dev);
 	unsigned long flags;
+	u32 retcode;
 
-	if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2)
+	if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) {
 		pkts = nv_rx_process(dev, limit);
-	else
+		retcode = nv_alloc_rx(dev);
+	} else {
 		pkts = nv_rx_process_optimized(dev, limit);
+		retcode = nv_alloc_rx_optimized(dev);
+	}
 
-	if (nv_alloc_rx(dev)) {
+	if (retcode) {
 		spin_lock_irqsave(&np->lock, flags);
 		if (!np->in_shutdown)
 			mod_timer(&np->oom_kick, jiffies + OOM_REFILL);


Attachments:
forcedeth_napi_fix (750.00 B)
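
Before this patch, the poll path picked the receive routine by descriptor version but always refilled the ring with nv_alloc_rx(); afterwards each receive routine is paired with its matching refill routine. The following is only a userspace sketch of that control flow, not driver code: struct toy_priv, the toy_* functions and the refill_path enum are hypothetical stand-ins, and the stubs simply record which refill path ran.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's descriptor-format constants. */
enum desc_ver { DESC_VER_1 = 1, DESC_VER_2, DESC_VER_3 };

/* Records which refill routine the poll path chose. */
enum refill_path { REFILL_NONE, REFILL_LEGACY, REFILL_OPTIMIZED };

/* Toy replacement for the driver's private state (assumption, not fe_priv). */
struct toy_priv {
	enum desc_ver desc_ver;
	enum refill_path last_refill;
	bool oom;		/* pretend that buffer allocation fails */
	bool timer_armed;	/* stands in for mod_timer(&np->oom_kick, ...) */
};

/* Stub refill for the older descriptor formats; nonzero means "ring not refilled". */
int toy_alloc_rx(struct toy_priv *np)
{
	np->last_refill = REFILL_LEGACY;
	return np->oom ? 1 : 0;
}

/* Stub refill for the newer descriptor format. */
int toy_alloc_rx_optimized(struct toy_priv *np)
{
	np->last_refill = REFILL_OPTIMIZED;
	return np->oom ? 1 : 0;
}

/* Refill step as structured by the patch: the choice follows desc_ver. */
void toy_poll_refill(struct toy_priv *np)
{
	int retcode;

	if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2)
		retcode = toy_alloc_rx(np);
	else
		retcode = toy_alloc_rx_optimized(np);

	/* On failure, schedule a later retry instead of giving up. */
	if (retcode)
		np->timer_armed = true;
}
```

With desc_ver set to DESC_VER_3 the sketch takes the optimized refill path, which is the case the unpatched code never reached.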

2007-02-09 11:58:28

by Tobias Diedrich

Subject: Re: forcedeth problems on 2.6.20-rc6-mm3

Ayaz Abdulla wrote:
> For all those who are having issues, please try out the attached patch.

Will try.
I reverted to 2.6.19 w/o suspend/resume patch last weekend to make
sure on 2.6.19 forcedeth is stable and noticed something odd:

Because I didn't include the suspend/resume patch I obviously had to
do a down/rmmod/modprobe/up cycle after each resume and I noticed that
the behaviour seems to alternate between resumes:

Behaviour 1:
After modprobe I get two interfaces 'eth0' and 'eth1' for the two
ports, as expected.

Behaviour 2:
After modprobe I get one interface 'eth3' (which should be 'eth1')
and one interface with increasing numbers (which should be 'eth0',
last resume it was 'eth12' IIRC).

As I said if I get behaviour 1 on one resume I get behaviour 2 on
the next resume and vice versa. That seems rather odd to me.

On a not quite related note, forcedeth shows a different ethtool
output (compared to e100), when no cable is connected to the port:

forcedeth, no cable connected:
|Settings for eth1:
| Supported ports: [ MII ]
| Supported link modes: 10baseT/Half 10baseT/Full
| 100baseT/Half 100baseT/Full
| 1000baseT/Full
| Supports auto-negotiation: Yes
| Advertised link modes: 10baseT/Half 10baseT/Full
| 100baseT/Half 100baseT/Full
| 1000baseT/Full
| Advertised auto-negotiation: Yes
| Speed: Unknown! (65535)
| Duplex: Unknown! (255)
| Port: MII
| PHYAD: 1
| Transceiver: external
| Auto-negotiation: on
| Supports Wake-on: g
| Wake-on: d
| Link detected: no

e100, no cable connected:
|Settings for eth0:
| Supported ports: [ TP MII ]
| Supported link modes: 10baseT/Half 10baseT/Full
| 100baseT/Half 100baseT/Full
| Supports auto-negotiation: Yes
| Advertised link modes: 10baseT/Half 10baseT/Full
| 100baseT/Half 100baseT/Full
| Advertised auto-negotiation: Yes
| Speed: 10Mb/s
| Duplex: Half
| Port: MII
| PHYAD: 1
| Transceiver: internal
| Auto-negotiation: on
| Supports Wake-on: g
| Wake-on: g
| Current message level: 0x00000007 (7)
| Link detected: no

Note that e100 returns the lowest possible speed if no link is
detected, while forcedeth seems to return -1, which ethtool doesn't
seem to recognise as a valid response (I guess, why else would it
show the number after 'Unknown!').
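
The two "Unknown!" values are consistent with -1 being stored in unsigned fields: 65535 is -1 converted to 16 bits and 255 is -1 converted to 8 bits, which matches the 16-bit speed and 8-bit duplex fields of struct ethtool_cmd. A small sketch of that conversion, using a hypothetical toy_link struct that mirrors only those field widths rather than the real kernel header:

```c
#include <stdint.h>

/* Hypothetical stand-in: only the widths of speed and duplex are modelled. */
struct toy_link {
	uint16_t speed;
	uint8_t duplex;
};

/* Report "no link" by storing -1 in both fields, as the output above suggests. */
struct toy_link toy_report_no_link(void)
{
	struct toy_link l;

	l.speed = (uint16_t)-1;	/* wraps to 65535 */
	l.duplex = (uint8_t)-1;	/* wraps to 255 */
	return l;
}
```

Whether the driver should report the lowest supported speed instead, as e100 does, is a separate question that this sketch does not settle.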

--
Tobias PGP: http://9ac7e0bc.uguu.de
This mail is made from 100% recycled bits.

2007-02-09 12:07:34

by Tobias Diedrich

Subject: Re: forcedeth problems on 2.6.20-rc6-mm3

Tobias Diedrich wrote:
> Ayaz Abdulla wrote:
> > For all those who are having issues, please try out the attached patch.
>
> Will try.

Does not apply cleanly against 2.6.20, is this one fixed up right?
--- linux-2.6.20/drivers/net/forcedeth.c.orig 2007-02-09 13:02:02.000000000 +0100
+++ linux-2.6.20/drivers/net/forcedeth.c.new 2007-02-09 13:03:45.000000000 +0100
@@ -2603,10 +2603,16 @@
 	struct fe_priv *np = netdev_priv(dev);
 	u8 __iomem *base = get_hwbase(dev);
 	unsigned long flags;
+	u32 retcode;
 
-	pkts = nv_rx_process(dev, limit);
+	if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) {
+		pkts = nv_rx_process(dev, limit);
+		retcode = nv_alloc_rx(dev);
+	} else {
+		retcode = nv_alloc_rx_optimized(dev);
+	}
 
-	if (nv_alloc_rx(dev)) {
+	if (retcode) {
 		spin_lock_irqsave(&np->lock, flags);
 		if (!np->in_shutdown)
 			mod_timer(&np->oom_kick, jiffies + OOM_REFILL);

--
Tobias PGP: http://9ac7e0bc.uguu.de
This mail is made from 100% recycled bits.

2007-02-09 14:50:40

by Jeff Garzik

Subject: Re: forcedeth problems on 2.6.20-rc6-mm3

Tobias Diedrich wrote:
> Tobias Diedrich wrote:
>> Ayaz Abdulla wrote:
>>> For all those who are having issues, please try out the attached patch.
>> Will try.
>
> Does not apply cleanly against 2.6.20, is this one fixed up right?

It probably needs to be applied on top of 2.6.20-git-latest or 2.6.20-rc6-mm3.

IOW, the forcedeth changes in question are not in 2.6.20, and you need
to apply the patch on top of the latest batch of forcedeth changes.

Jeff



2007-02-10 04:41:04

by Robert Hancock

Subject: Re: forcedeth problems on 2.6.20-rc6-mm3

Ayaz Abdulla wrote:
> For all those who are having issues, please try out the attached patch.
>
> Ayaz

Seems to solve the problem for me (not heavily tested, but certainly
isn't totally dead as it was before).

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [email protected]
Home Page: http://www.roberthancock.com/

2007-02-12 06:28:30

by Tobias Diedrich

Subject: Re: forcedeth problems on 2.6.20-rc6-mm3

Jeff Garzik wrote:
> Tobias Diedrich wrote:
> >Tobias Diedrich wrote:
> >>Ayaz Abdulla wrote:
> >>>For all those who are having issues, please try out the attached patch.
> >>Will try.
> >
> >Does not apply cleanly against 2.6.20, is this one fixed up right?
>
> It probably needs to be applied on top of 2.6.20-git-latest or 2.6.20-rc6-mm3.
>
> IOW, the forcedeth changes in question are not in 2.6.20, and you need
> to apply the patch on top of the latest batch of forcedeth changes.

Well, it hasn't blown up on me despite being applied to 2.6.20...
The problem I was seeing might even be fixed in 2.6.20 vanilla,
since the last version I saw it in was 2.6.20-rc6 and then I
reverted to 2.6.19 to make sure that one is ok (see
[email protected]).

--
Tobias PGP: http://9ac7e0bc.uguu.de
This mail is made from 100% recycled bits.

2007-02-16 14:54:25

by Tobias Diedrich

Subject: Re: forcedeth problems on 2.6.20-rc6-mm3

Tobias Diedrich wrote:
> Jeff Garzik wrote:
> > Tobias Diedrich wrote:
> > >Tobias Diedrich wrote:
> > >>Ayaz Abdulla wrote:
> > >>>For all those who are having issues, please try out the attached patch.
> > >>Will try.
> > >
> > >Does not apply cleanly against 2.6.20, is this one fixed up right?
> >
> > It probably needs to be applied on top of 2.6.20-git-latest or 2.6.20-rc6-mm3.
> >
> > IOW, the forcedeth changes in question are not in 2.6.20, and you need
> > to apply the patch on top of the latest batch of forcedeth changes.
>
> Well, it hasn't blown up on me despite being applied to 2.6.20...
> The problem I was seeing might even be fixed in 2.6.20 vanilla,
> since the last version I saw it in was 2.6.20-rc6 and then I
> reverted to 2.6.19 to make sure that one is ok (see
> [email protected]).

And having run vanilla 2.6.20 since my last mail, I haven't seen the
problem on that one either.
So I _guess_ the particular problem I was seeing was fixed somewhere
between 2.6.20-rc6 and 2.6.20. But since I can't reliably trigger
it, I can't say that for sure.

--
Tobias PGP: http://9ac7e0bc.uguu.de
This mail is made from 100% recycled bits.

2007-02-20 00:39:11

by Robert Hancock

Subject: Re: forcedeth problems on 2.6.20-rc6-mm3

Ayaz Abdulla wrote:
>
> For all those who are having issues, please try out the attached patch.
>
> Ayaz
>
> --- orig/drivers/net/forcedeth.c 2007-02-08 21:41:59.000000000 -0500
> +++ new/drivers/net/forcedeth.c 2007-02-08 21:44:53.000000000 -0500
> @@ -3104,13 +3104,17 @@
>  	struct fe_priv *np = netdev_priv(dev);
>  	u8 __iomem *base = get_hwbase(dev);
>  	unsigned long flags;
> +	u32 retcode;
>
> -	if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2)
> +	if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) {
>  		pkts = nv_rx_process(dev, limit);
> -	else
> +		retcode = nv_alloc_rx(dev);
> +	} else {
>  		pkts = nv_rx_process_optimized(dev, limit);
> +		retcode = nv_alloc_rx_optimized(dev);
> +	}
>
> -	if (nv_alloc_rx(dev)) {
> +	if (retcode) {
>  		spin_lock_irqsave(&np->lock, flags);
>  		if (!np->in_shutdown)
>  			mod_timer(&np->oom_kick, jiffies + OOM_REFILL);

Did anyone push this patch into mainline? forcedeth on 2.6.20-git14 is
still completely broken without this patch.

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [email protected]
Home Page: http://www.roberthancock.com/


2007-02-20 00:49:01

by Ayaz Abdulla

Subject: Re: forcedeth problems on 2.6.20-rc6-mm3



Robert Hancock wrote:
> Ayaz Abdulla wrote:
>
>>
>> For all those who are having issues, please try out the attached patch.
>>
>> Ayaz
>>
>> --- orig/drivers/net/forcedeth.c 2007-02-08 21:41:59.000000000 -0500
>> +++ new/drivers/net/forcedeth.c 2007-02-08 21:44:53.000000000 -0500
>> @@ -3104,13 +3104,17 @@
>>  	struct fe_priv *np = netdev_priv(dev);
>>  	u8 __iomem *base = get_hwbase(dev);
>>  	unsigned long flags;
>> +	u32 retcode;
>>
>> -	if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2)
>> +	if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) {
>>  		pkts = nv_rx_process(dev, limit);
>> -	else
>> +		retcode = nv_alloc_rx(dev);
>> +	} else {
>>  		pkts = nv_rx_process_optimized(dev, limit);
>> +		retcode = nv_alloc_rx_optimized(dev);
>> +	}
>>
>> -	if (nv_alloc_rx(dev)) {
>> +	if (retcode) {
>>  		spin_lock_irqsave(&np->lock, flags);
>>  		if (!np->in_shutdown)
>>  			mod_timer(&np->oom_kick, jiffies + OOM_REFILL);
>
>
> Did anyone push this patch into mainline? forcedeth on 2.6.20-git14 is
> still completely broken without this patch.
>

I have submitted the patch to netdev mailing list.