2007-01-13 20:31:37

by Eric Buddington

Subject: 2.6.20-rc4-mm1 USB (asix) problem

The following problem occurred on an Athlon64 X2 under 2.6.20-rc4-mm1,
but not 2.6.20-rc3-mm1.

I'm using two D-Link DUB-E100 USB ethernet adapters, using the 'asix'
driver. When I upgraded to 2.6.20-rc4-mm1, they were still recognized,
but various ifconfig operations on them (up/down, changing IP) caused
a system freeze (including caps lock/num lock lights) for many seconds.
I do not believe there was anything new in dmesg when the system
resumed. USB debugging was not turned on at the time, though the
problem is repeatable.

Also, no packets actually made it out of the adapters (watching from
other systems on the network).

Since this is a system we need running and networked, I can't do
extensive testing on it, but I might be able to bring it down for a few
quick tests if that would help.

-Eric


2007-01-15 19:50:47

by Eric Buddington

Subject: Re: 2.6.20-rc4-mm1 USB (asix) problem

On Mon, Jan 15, 2007 at 07:27:56PM +0000, David Hollis wrote:
> Do you happen to have a Rev. B1 DLink adapter? If so, the only change
> that was put in (PHY Select fix) should actually make these devices
> work. Can you check the top of the ax88772_bind() call in your file and
> see if it has this bit:
>
>         if ((ret = asix_write_cmd(dev, AX_CMD_SW_PHY_SELECT,
>                                 1, 0, 0, buf)) < 0) {
>                 dbg("Select PHY #1 failed: %d", ret);
>                 goto out2;
>         }
>
>
> That '1' after the AX_CMD_SW_PHY_SELECT was the key to that patch. If
> yours is 1, you could try setting it to 0, though that should make
> things not work. I'd be very interested if it made things work for you.
> BTW, the ramifications of this bug were similar to what you describe:
> the interface would come up, look fine but just wouldn't send or receive
> any packets. The hard lock-ups and such are likely from something else.

I don't know the Rev number of the adapter; I can't get to it physically
today, and I don't see it in dmesg.

The asix_write_cmd argument in question did indeed change from 0 to 1
between 2.6.20-rc3-mm1 and -rc4-mm1. I'll change it back, rebuild,
and test. Probably tomorrow.

Thanks.

-Eric

2007-01-15 20:29:08

by David Hollis

Subject: Re: 2.6.20-rc4-mm1 USB (asix) problem

On Sat, 2007-01-13 at 15:31 -0500, Eric Buddington wrote:
> The following problem occurred on an Athlon64 X2 under 2.6.20-rc4-mm1,
> but not 2.6.20-rc3-mm1.
>
> I'm using two D-Link DUB-E100 USB ethernet adapters, using the 'asix'
> driver. When I upgraded to 2.6.20-rc4-mm1, they were still recognized,
> but various ifconfig operations on them (up/down, changing IP) caused
> a system freeze (including caps lock/num lock lights) for many seconds.
> I do not believe there was anything new in dmesg when the system
> resumed. USB debugging was not turned on at the time, though the
> problem is repeatable.
>
> Also, no packets actually made it out of the adapters (watching from
> other systems on the network).
>
> Since this is a system we need running and networked, I can't do
> extensive testing on it, but I might be able to bring it down for a few
> quick tests if that would help.

Do you happen to have a Rev. B1 DLink adapter? If so, the only change
that was put in (PHY Select fix) should actually make these devices
work. Can you check the top of the ax88772_bind() call in your file and
see if it has this bit:

        if ((ret = asix_write_cmd(dev, AX_CMD_SW_PHY_SELECT,
                                1, 0, 0, buf)) < 0) {
                dbg("Select PHY #1 failed: %d", ret);
                goto out2;
        }


That '1' after the AX_CMD_SW_PHY_SELECT was the key to that patch. If
yours is 1, you could try setting it to 0, though that should make
things not work. I'd be very interested if it made things work for you.
BTW, the ramifications of this bug were similar to what you describe:
the interface would come up, look fine but just wouldn't send or receive
any packets. The hard lock-ups and such are likely from something else.

--
David Hollis

2007-01-15 20:32:41

by David Hollis

Subject: Re: 2.6.20-rc4-mm1 USB (asix) problem

On Mon, 2007-01-15 at 14:50 -0500, Eric Buddington wrote:

> The asix_write_cmd argument in question did indeed change from 0 to 1
> between 2.6.20-rc3-mm1 and -rc4-mm1. I'll change it back, rebuild,
> and test. Probably tomorrow.
>

Interesting. It would really be something if your devices happen to
work better with 0. Wouldn't make much sense at all unfortunately. If
0 works, could you also try setting it to 2 or 3? The PHY select value
is a bit field: bit 0 selects the onboard PHY, and bit 1 tells the chip to
'auto-select' the PHY based on link status. The data sheet
indicates that 3 should be the default, but all of the literature I have
seen from ASIX says to write a 1 to it.
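
As a rough sketch of that bit layout only (the macro and function names
below are made up for illustration and do not come from the asix driver
or the data sheet), the values 0 through 3 decode like this:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, for illustration only; not from the asix driver. */
#define PHY_SEL_ONBOARD  (1u << 0)  /* bit 0: select the onboard PHY */
#define PHY_SEL_AUTO     (1u << 1)  /* bit 1: auto-select PHY from link status */

/* True if the value asks for the onboard PHY (bit 0 set). */
static bool phy_sel_onboard(uint8_t value)
{
        return (value & PHY_SEL_ONBOARD) != 0;
}

/* True if the value asks for auto-selection (bit 1 set). */
static bool phy_sel_auto(uint8_t value)
{
        return (value & PHY_SEL_AUTO) != 0;
}

/* Short human-readable summary of a PHY select value in the range 0..3. */
static const char *phy_sel_describe(uint8_t value)
{
        switch (value & (PHY_SEL_ONBOARD | PHY_SEL_AUTO)) {
        case 0:
                return "neither bit set";
        case PHY_SEL_ONBOARD:
                return "onboard PHY";
        case PHY_SEL_AUTO:
                return "auto-select";
        default:
                return "onboard PHY + auto-select";
        }
}
```

So writing 1 sets only the onboard-PHY bit, 2 sets only the auto-select
bit, and 3 sets both.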

And FWIW, that change for setting it to 1 fixed a bunch of broken
adapters, including mine.

--
David Hollis

2007-01-16 23:59:26

by Eric Buddington

Subject: Re: 2.6.20-rc4-mm1 USB (asix) problem

On Mon, Jan 15, 2007 at 08:32:17PM +0000, David Hollis wrote:
> Interesting. It would really be something if your devices happen to
> work better with 0. Wouldn't make much sense at all unfortunately. If
> 0 works, could you also try setting it to 2 or 3? The PHY select value
> is a bit field: bit 0 selects the onboard PHY, and bit 1 tells the chip to
> 'auto-select' the PHY based on link status. The data sheet
> indicates that 3 should be the default, but all of the literature I have
> seen from ASIX says to write a 1 to it.

My hardware is ver. B1.

0, 2, and 3 all worked for me. 1, as before, does not.

'rmmod asix' takes a really long time (45-80s) with any setting, and
sometimes coincides with ksoftirqd pegging (99.9% CPU) for several
seconds.

-Eric

2007-01-17 12:01:17

by David Hollis

Subject: Re: 2.6.20-rc4-mm1 USB (asix) problem

On Tue, 2007-01-16 at 17:59 -0500, Eric Buddington wrote:
> On Mon, Jan 15, 2007 at 08:32:17PM +0000, David Hollis wrote:
> > Interesting. It would really be something if your devices happen to
> > work better with 0. Wouldn't make much sense at all unfortunately. If
> > 0 works, could you also try setting it to 2 or 3? The PHY select value
> > is a bit field: bit 0 selects the onboard PHY, and bit 1 tells the chip to
> > 'auto-select' the PHY based on link status. The data sheet
> > indicates that 3 should be the default, but all of the literature I have
> > seen from ASIX says to write a 1 to it.
>
> My hardware is ver. B1.
>
> 0, 2, and 3 all worked for me. 1, as before, does not.
>

That's good to hear. Some other patches have started floating around to
deal with these cases of internal vs. external PHYs and also seem to
work.

> 'rmmod asix' takes a really long time (45-80s) with any setting, and
> sometimes coincides with ksoftirqd pegging (99.9% CPU) for several
> seconds.

This I haven't seen before. Does it occur even when the device is able
to work (using 0 or the like from above)? This may be due to something
else in the USB subsystem.

--
David Hollis

2007-01-17 13:55:20

by Eric Buddington

Subject: Re: 2.6.20-rc4-mm1 USB (asix) problem

On Wed, Jan 17, 2007 at 07:00:48AM -0500, David Hollis wrote:
> > 'rmmod asix' takes a really long time (45-80s) with any setting, and
> > sometimes coincides with ksoftirqd pegging (99.9% CPU) for several
> > seconds.
>
> This I haven't seen before. Does it occur even when the device is able
> to work (using 0 or the like from above)? This may be due to something
> else in the USB subsystem.

Yes, the delay occurs even when the device works fine, and it results
in no suspicious dmesg output (just a couple of 'unregistering' messages).

I have no case when this delay doesn't occur; it's only in this
testing that I've had occasion to rmmod the driver at all. In and of
itself, it's not a big problem.

-Eric