2005-03-30 18:03:44

by Jim Gifford

Subject: 64bit build of tulip driver

Under 32bit the tulip driver works fine, but under 64 bit it gives me a
lot of problems. I updated the tulip driver
to what is in the current repository, and the issue still exists. Any
suggestions?

First off, it continually sends data out the network interface and never
negotiates its speed and duplex.
Second, in the log files all I see is an uninformative message:
0000:00:07.0: tulip_stop_rxtx() failed

Here are all the differences in the bootup information I can find for the driver:
64 bit
Dec 31 16:01:29 lfs tulip0: ***WARNING***: No MII transceiver found!
Dec 31 16:01:29 lfs tulip1: ***WARNING***: No MII transceiver found!
32 bit
Dec 31 16:01:16 lfs tulip0: MII transceiver #1 config 1000 status 7809
advertising 01e1
Dec 31 16:01:16 lfs tulip1: MII transceiver #1 config 1000 status 7809
advertising 01e1.

Complete boot log - yes I know the date and time are off.
Under a 64 bit compile
Dec 31 16:01:29 lfs Linux Tulip driver version 1.1.13 (May 11, 2002)
Dec 31 16:01:29 lfs PCI: Enabling device 0000:00:07.0 (0045 -> 0047)
Dec 31 16:01:29 lfs tulip0: Old format EEPROM on 'Cobalt Microserver'
board. Using substitute media control info.
Dec 31 16:01:29 lfs tulip0: EEPROM default media type Autosense.
Dec 31 16:01:29 lfs tulip0: Index #0 - Media MII (#11) described by a
21142 MII PHY (3) block.
Dec 31 16:01:29 lfs tulip0: ***WARNING***: No MII transceiver found!
Dec 31 16:01:29 lfs eth0: Digital DS21143 Tulip rev 65 at
ffffffffb0001400, 00:10:E0:00:32:DE, IRQ 19.
Dec 31 16:01:29 lfs PCI: Enabling device 0000:00:0c.0 (0005 -> 0007)
Dec 31 16:01:29 lfs tulip1: Old format EEPROM on 'Cobalt Microserver'
board. Using substitute media control info.
Dec 31 16:01:29 lfs tulip1: EEPROM default media type Autosense.
Dec 31 16:01:29 lfs tulip1: Index #0 - Media MII (#11) described by a
21142 MII PHY (3) block.
Dec 31 16:01:29 lfs tulip1: ***WARNING***: No MII transceiver found!
Dec 31 16:01:29 lfs eth1: Digital DS21143 Tulip rev 65 at
ffffffffb0001480, 00:10:E0:00:32:DF, IRQ 20.
Dec 31 16:01:29 lfs bootlog: Bringing up the eth0 interface...[ OK ]
Dec 31 16:01:30 lfs bootlog: Adding IPv4 address 172.16.0.99 to the
eth0 interface...[ OK ]
Dec 31 16:01:31 lfs bootlog: Setting up default gateway...[ OK ]
Dec 31 16:01:32 lfs 0000:00:07.0: tulip_stop_rxtx() failed
Dec 31 16:01:38 lfs 0000:00:07.0: tulip_stop_rxtx() failed
Dec 31 16:01:44 lfs 0000:00:07.0: tulip_stop_rxtx() failed
Dec 31 16:01:50 lfs 0000:00:07.0: tulip_stop_rxtx() failed
Dec 31 16:01:56 lfs 0000:00:07.0: tulip_stop_rxtx() failed
Dec 31 16:02:02 lfs 0000:00:07.0: tulip_stop_rxtx() failed
Dec 31 16:02:08 lfs 0000:00:07.0: tulip_stop_rxtx() failed

Under 32 bit
Dec 31 16:01:16 lfs Linux Tulip driver version 1.1.13 (May 11, 2002)
Dec 31 16:01:16 lfs PCI: Enabling device 0000:00:07.0 (0045 -> 0047)
Dec 31 16:01:16 lfs tulip0: Old format EEPROM on 'Cobalt Microserver'
board. Using substitute media control info.
Dec 31 16:01:16 lfs tulip0: EEPROM default media type Autosense.
Dec 31 16:01:16 lfs tulip0: Index #0 - Media MII (#11) described by a
21142 MII PHY (3) block.
Dec 31 16:01:16 lfs tulip0: MII transceiver #1 config 1000 status 7809
advertising 01e1.
Dec 31 16:01:16 lfs eth0: Digital DS21143 Tulip rev 65 at b0001400,
00:10:E0:00:32:DE, IRQ 19.
Dec 31 16:01:16 lfs tulip1: Old format EEPROM on 'Cobalt Microserver'
board. Using substitute media control info.
Dec 31 16:01:16 lfs tulip1: EEPROM default media type Autosense.
Dec 31 16:01:16 lfs tulip1: Index #0 - Media MII (#11) described by a
21142 MII PHY (3) block.
Dec 31 16:01:16 lfs tulip1: MII transceiver #1 config 1000 status 7809
advertising 01e1.
Dec 31 16:01:16 lfs eth1: Digital DS21143 Tulip rev 65 at b0001480,
00:10:E0:00:32:DF, IRQ 20.
Dec 31 16:01:17 lfs bootlog: Bringing up the eth0 interface...[ OK ]
Dec 31 16:01:17 lfs bootlog: Adding IPv4 address 172.16.0.99 to the
eth0 interface...[ OK ]
Dec 31 16:01:18 lfs bootlog: Setting up default gateway...[ OK ]
Dec 31 16:01:20 lfs eth0: Setting full-duplex based on MII#1 link
partner capability of 45e1.
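
The MII values in the 32-bit log can be decoded with the standard IEEE 802.3 clause 22 register bits. The following is a minimal user-space C sketch, not driver code. It assumes the logged values are hexadecimal, the helper names (`bmsr_link_up`, `best_common_mode`, `check_logged_values`) are made up for illustration, and 100BASE-T4 is ignored because neither side advertises it here.

```c
#include <assert.h>
#include <stdint.h>

/* Basic mode status register (BMSR) bits, IEEE 802.3 clause 22. */
#define BMSR_100FULL     0x4000u /* PHY can do 100BASE-TX full duplex */
#define BMSR_100HALF     0x2000u /* PHY can do 100BASE-TX half duplex */
#define BMSR_10FULL      0x1000u /* PHY can do 10BASE-T full duplex */
#define BMSR_10HALF      0x0800u /* PHY can do 10BASE-T half duplex */
#define BMSR_ANEGCAPABLE 0x0008u /* PHY supports auto-negotiation */
#define BMSR_LSTATUS     0x0004u /* link is up */

/* Ability bits shared by the advertisement and link partner registers. */
#define ABILITY_10HALF   0x0020u
#define ABILITY_10FULL   0x0040u
#define ABILITY_100HALF  0x0080u
#define ABILITY_100FULL  0x0100u

/* Hypothetical helper: nonzero if the BMSR value reports link up. */
int bmsr_link_up(uint16_t bmsr)
{
    return (bmsr & BMSR_LSTATUS) != 0;
}

/* Hypothetical helper: highest-priority mode both ends offer, or 0. */
uint16_t best_common_mode(uint16_t advertised, uint16_t partner)
{
    static const uint16_t by_priority[] = {
        ABILITY_100FULL, ABILITY_100HALF, ABILITY_10FULL, ABILITY_10HALF
    };
    uint16_t common = advertised & partner;
    unsigned int i;

    for (i = 0; i < sizeof(by_priority) / sizeof(by_priority[0]); i++)
        if (common & by_priority[i])
            return by_priority[i];
    return 0;
}

/* Self-check against the values in the 32-bit boot log. */
void check_logged_values(void)
{
    /* status 7809: all four speed/duplex abilities, link bit clear */
    assert((0x7809u & (BMSR_100FULL | BMSR_10HALF)) ==
           (BMSR_100FULL | BMSR_10HALF));
    assert(!bmsr_link_up(0x7809u));
    /* advertising 01e1 vs. link partner 45e1: 100 Mbit full duplex */
    assert(best_common_mode(0x01e1u, 0x45e1u) == ABILITY_100FULL);
}
```

Read this way, the 32-bit log shows a PHY that answers on the management bus and later agrees on full duplex with its link partner, while the 64-bit build never gets an answer from the PHY at all.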

--
----
Jim Gifford
[email protected]


2005-03-31 16:10:42

by Grant Grundler

Subject: Re: 64bit build of tulip driver

On Wed, Mar 30, 2005 at 10:03:12AM -0800, Jim Gifford wrote:
> Under 32bit the tulip driver works fine, but under 64 bit it gives me a
> lot of problems.

Sorry - I'm not seeing issues on either ia64 or parisc 64-bit systems.
But I'm only using HP 100BT cards (4-port, occasionally variants of
single port cards, and built-in on parisc workstations/servers).

2.6.12-rc1 bits seem to work fine on a500 (aka rp2470).


> I updated the tulip driver to what is in the current repository, and the
> issue still exists. Any suggestions?
>
> First off, it continually sends data out the network interface and never
> negotiates its speed and duplex.
> Second, in the log files all I see is an uninformative message:
> 0000:00:07.0: tulip_stop_rxtx() failed
>
> Here are all the differences in the bootup information I can find for the driver:

Are there any config option differences?
e.g. MWI or MMIO options enabled on 64-bit but not 32-bit?

> 64 bit
> Dec 31 16:01:29 lfs tulip0: ***WARNING***: No MII transceiver found!
> Dec 31 16:01:29 lfs tulip1: ***WARNING***: No MII transceiver found!

You'll have to add printk's until you can sort out why the MII transceiver
isn't responding. Odds are 64-bit code runs faster than 32-bit on
the same machine (more registers or something).

> 32 bit
> Dec 31 16:01:16 lfs tulip0: MII transceiver #1 config 1000 status 7809
> advertising 01e1
> Dec 31 16:01:16 lfs tulip1: MII transceiver #1 config 1000 status 7809
> advertising 01e1.
>
> Complete boot log - yes I know the date and time are off.
> Under a 64 bit compile
> Dec 31 16:01:29 lfs Linux Tulip driver version 1.1.13 (May 11, 2002)

Interesting. My source tree says:
#define DRV_RELDATE "December 15, 2004"
(same version # though)

> Dec 31 16:01:29 lfs PCI: Enabling device 0000:00:07.0 (0045 -> 0047)
> Dec 31 16:01:29 lfs tulip0: Old format EEPROM on 'Cobalt Microserver'
> board. Using substitute media control info.
> Dec 31 16:01:29 lfs tulip0: EEPROM default media type Autosense.
> Dec 31 16:01:29 lfs tulip0: Index #0 - Media MII (#11) described by a
> 21142 MII PHY (3) block.
> Dec 31 16:01:29 lfs tulip0: ***WARNING***: No MII transceiver found!
> Dec 31 16:01:29 lfs eth0: Digital DS21143 Tulip rev 65 at
> ffffffffb0001400, 00:10:E0:00:32:DE, IRQ 19.

HP is using exactly this chip. Difference seems to be with the phy/MII.

> Dec 31 16:01:29 lfs PCI: Enabling device 0000:00:0c.0 (0005 -> 0007)
> Dec 31 16:01:29 lfs tulip1: Old format EEPROM on 'Cobalt Microserver'
> board. Using substitute media control info.
> Dec 31 16:01:29 lfs tulip1: EEPROM default media type Autosense.
> Dec 31 16:01:29 lfs tulip1: Index #0 - Media MII (#11) described by a
> 21142 MII PHY (3) block.
> Dec 31 16:01:29 lfs tulip1: ***WARNING***: No MII transceiver found!
> Dec 31 16:01:29 lfs eth1: Digital DS21143 Tulip rev 65 at
> ffffffffb0001480, 00:10:E0:00:32:DF, IRQ 20.
> Dec 31 16:01:29 lfs bootlog: Bringing up the eth0 interface...[ OK ]
> Dec 31 16:01:30 lfs bootlog: Adding IPv4 address 172.16.0.99 to the
> eth0 interface...[ OK ]
> Dec 31 16:01:31 lfs bootlog: Setting up default gateway...[ OK ]
> Dec 31 16:01:32 lfs 0000:00:07.0: tulip_stop_rxtx() failed
> Dec 31 16:01:38 lfs 0000:00:07.0: tulip_stop_rxtx() failed
> Dec 31 16:01:44 lfs 0000:00:07.0: tulip_stop_rxtx() failed
> Dec 31 16:01:50 lfs 0000:00:07.0: tulip_stop_rxtx() failed
> Dec 31 16:01:56 lfs 0000:00:07.0: tulip_stop_rxtx() failed
> Dec 31 16:02:02 lfs 0000:00:07.0: tulip_stop_rxtx() failed
> Dec 31 16:02:08 lfs 0000:00:07.0: tulip_stop_rxtx() failed

ISTR submitting a patch so additional data
gets printed in tulip_stop_rxtx. Here is a reference to the patch,
but I don't think it is relevant to this problem:
http://lkml.org/lkml/2004/12/15/119

grant

> Under 32 bit
> Dec 31 16:01:16 lfs Linux Tulip driver version 1.1.13 (May 11, 2002)
> Dec 31 16:01:16 lfs PCI: Enabling device 0000:00:07.0 (0045 -> 0047)
> Dec 31 16:01:16 lfs tulip0: Old format EEPROM on 'Cobalt Microserver'
> board. Using substitute media control info.
> Dec 31 16:01:16 lfs tulip0: EEPROM default media type Autosense.
> Dec 31 16:01:16 lfs tulip0: Index #0 - Media MII (#11) described by a
> 21142 MII PHY (3) block.
> Dec 31 16:01:16 lfs tulip0: MII transceiver #1 config 1000 status 7809
> advertising 01e1.
> Dec 31 16:01:16 lfs eth0: Digital DS21143 Tulip rev 65 at b0001400,
> 00:10:E0:00:32:DE, IRQ 19.
> Dec 31 16:01:16 lfs tulip1: Old format EEPROM on 'Cobalt Microserver'
> board. Using substitute media control info.
> Dec 31 16:01:16 lfs tulip1: EEPROM default media type Autosense.
> Dec 31 16:01:16 lfs tulip1: Index #0 - Media MII (#11) described by a
> 21142 MII PHY (3) block.
> Dec 31 16:01:16 lfs tulip1: MII transceiver #1 config 1000 status 7809
> advertising 01e1.
> Dec 31 16:01:16 lfs eth1: Digital DS21143 Tulip rev 65 at b0001480,
> 00:10:E0:00:32:DF, IRQ 20.
> Dec 31 16:01:17 lfs bootlog: Bringing up the eth0 interface...[ OK ]
> Dec 31 16:01:17 lfs bootlog: Adding IPv4 address 172.16.0.99 to the
> eth0 interface...[ OK ]
> Dec 31 16:01:18 lfs bootlog: Setting up default gateway...[ OK ]
> Dec 31 16:01:20 lfs eth0: Setting full-duplex based on MII#1 link
> partner capability of 45e1.
>
> --
> ----
> Jim Gifford
> [email protected]
>

2005-04-01 03:52:39

by Jim Gifford

Subject: Re: 64bit build of tulip driver

Grant
Thanx for your feedback. I got it working, but I don't think the
patch is the best. Here is the patch, and the information, but if you
can recommend a different way to fix it, let me know. The patch was done
by Peter Horton.
Here is the link to the full patch,
http://ftp.jg555.com/patches/raq2/linux/linux-2.6.11.6-raq2_fix-2.patch
but here is the section for this issue
@@ -1628,6 +1631,16 @@
}
}

+#if defined(CONFIG_MIPS_COBALT) && defined(CONFIG_MIPS64)
+ /*
+ * something very bad is happening. without this
+ * cache flush the PHY can't be read. I've tried
+ * various ins & outs, delays etc but only a call
+ * to printk or this flush seems to fix it ... help!
+ */
+ flush_cache_all();
+#endif
+
/* Find the connected MII xcvrs.
Doing this in open() would allow detecting external xcvrs
later, but takes much time. */

>Are there any config option differences?
>e.g. MWI or MMIO options enabled on 64-bit but not 32-bit?
>
>
I verified that there are no differences.

>ISTR submitting a patch so additional data
>gets printed in tulip_stop_rxtx. Here is a reference to the patch,
>but I don't think it is relevant to this problem:
> http://lkml.org/lkml/2004/12/15/119
>
>
>
Applied the patch, here is the output

0000:00:07.0: tulip_stop_rxtx() failed (CSR5 0xf0660000 CSR6 0xb3862002)
0000:00:07.0: tulip_stop_rxtx() failed (CSR5 0xf0660000 CSR6 0xb3862002)
0000:00:07.0: tulip_stop_rxtx() failed (CSR5 0xf0660000 CSR6 0xb3862002)
0000:00:07.0: tulip_stop_rxtx() failed (CSR5 0xf0660000 CSR6 0xb3862002)
0000:00:07.0: tulip_stop_rxtx() failed (CSR5 0xf0660000 CSR6 0xb3862002)
0000:00:07.0: tulip_stop_rxtx() failed (CSR5 0xf0660000 CSR6 0xb3862002)
0000:00:07.0: tulip_stop_rxtx() failed (CSR5 0xf0660000 CSR6 0xb3862002)
0000:00:07.0: tulip_stop_rxtx() failed (CSR5 0xf0660000 CSR6 0xb3862002)

I was able to get some more information on the bootup sequence with the
updates.
Here is the output now from the driver

Linux Tulip driver version 1.1.13 (May 11, 2002)
PCI: Enabling device 0000:00:07.0 (0045 -> 0047)
tulip0: Old format EEPROM on 'Cobalt Microserver' board. Using
substitute media control info.
tulip0: EEPROM default media type Autosense.
tulip0: Index #0 - Media MII (#11) described by a 21142 MII PHY (3) block.
tulip0: ***WARNING***: No MII transceiver found!
eth0: Digital DS21143 Tulip rev 65 at ffffffffb0001400,
00:10:E0:00:32:DE, IRQ 19.
PCI: Enabling device 0000:00:0c.0 (0005 -> 0007)
tulip1: Old format EEPROM on 'Cobalt Microserver' board. Using
substitute media control info.
tulip1: EEPROM default media type Autosense.
tulip1: Index #0 - Media MII (#11) described by a 21142 MII PHY (3) block.
tulip1: ***WARNING***: No MII transceiver found!
eth1: Digital DS21143 Tulip rev 65 at ffffffffb0001480,
00:10:E0:00:32:DF, IRQ 20.


--
----
Jim Gifford
[email protected]

2005-04-01 06:50:59

by Grant Grundler

Subject: Re: 64bit build of tulip driver

On Thu, Mar 31, 2005 at 07:52:06PM -0800, Jim Gifford wrote:
> Grant
> Thanx for your feedback. I got it working, but I don't think the
> patch is the best. Here is the patch, and the information, but if you
> can recommend a different way to fix it, let me know.

I cannot "recommend" one. I can suggest other things to try,
since I'm very skeptical this patch will get accepted by
the maintainer (Jeff Garzik). He normally wants a much
better explanation of the problem than "this works".


> The patch was done by Peter Horton.
> Here is the link to the full patch,
> http://ftp.jg555.com/patches/raq2/linux/linux-2.6.11.6-raq2_fix-2.patch
> but here is the section for this issue

Jim,
You have other changes to tulip_core.c:
+ /* Avoid a chip errata by prefixing a dummy entry. Don't do
+ this on the ULI526X as it triggers a different problem */
....


Picking a few nits:
o comment extends past 80 columns - please wrap before 80 columns
o *Which* chip errata?
o *Which* other problem?
o I prefer diffs with "-p" when reviewing patches so I know which
function is getting mangled.

- /* No media table either */
- tp->flags &= ~HAS_MEDIA_TABLE;
+ /* Ensure our media table fixup get's applied */
+ memcpy(ee_data + 16, ee_data, 8);

This isn't likely to get far either unless it's better explained.
You don't have to explain it to me, now. But have something handy
if you want jgarzik to accept it.


> @@ -1628,6 +1631,16 @@
> }
> }
>
> +#if defined(CONFIG_MIPS_COBALT) && defined(CONFIG_MIPS64)
> + /*
> + * something very bad is happening. without this
> + * cache flush the PHY can't be read. I've tried
> + * various ins & outs, delays etc but only a call
> + * to printk or this flush seems to fix it ... help!
> + */
> + flush_cache_all();
> +#endif

The code immediately before this calls tulip_select_media().
Code paths exist in tulip_select_media() where the last thing the
driver does to the NIC is io_write(). This could easily be a posted
write flush problem. Does replacing flush_cache_all() with
"ioread32(ioaddr + CSR12)" also work?

Can you find out how long one has to wait after banging
on CSR12 before it's safe to call tulip_find_mii()?

How long does flush_cache_all() take in microseconds?

It's possible this is a very fast PPC chip and it's executing the
code path between tulip_select_media() and tulip_find_mii()
faster than the chips can finish dealing with the writes to CSR12.
I'd consider this issue if flushing posted PCI writes doesn't help.

The tulip changes I maintain in parisc-linux port deal with
similar issues where the driver is not following the specified
timing requirements.
Search google for "tulip 802.3 22.2.4 Management functions"
or look into http://cvs.parisc-linux.org/linux-2.6/.
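
The posted-write idea above can be illustrated with a toy user-space model in C. This is only a sketch of the general mechanism, not of the Cobalt hardware or the tulip driver: `toy_bridge`, `toy_iowrite32` and `toy_ioread32` are hypothetical names, and the single register merely stands in for CSR12. Writes are queued by the bridge and only reach the device when the queue is drained, which a read from the device forces.

```c
#include <assert.h>
#include <stdint.h>

#define POSTED_DEPTH 4 /* arbitrary queue depth for the toy model */

/* Toy device with one 32-bit register standing in for CSR12. */
struct toy_dev {
    uint32_t csr12;
};

/* Toy bridge: writes are queued ("posted") and delivered later. */
struct toy_bridge {
    struct toy_dev *dev;
    uint32_t pending[POSTED_DEPTH];
    unsigned int count;
};

/* Deliver all queued writes to the device, oldest first. */
static void toy_drain(struct toy_bridge *b)
{
    unsigned int i;

    for (i = 0; i < b->count; i++)
        b->dev->csr12 = b->pending[i];
    b->count = 0;
}

/* A posted write returns before the device has seen the value. */
void toy_iowrite32(struct toy_bridge *b, uint32_t val)
{
    if (b->count == POSTED_DEPTH)
        toy_drain(b);
    assert(b->count < POSTED_DEPTH);
    b->pending[b->count++] = val;
}

/* A read cannot complete until earlier posted writes are delivered. */
uint32_t toy_ioread32(struct toy_bridge *b)
{
    toy_drain(b);
    return b->dev->csr12;
}

/* Walk through the write, then read-back, sequence. */
void demo_posted_write(void)
{
    struct toy_dev dev = { 0 };
    struct toy_bridge br = { &dev, { 0 }, 0 };

    toy_iowrite32(&br, 0x1u);
    assert(dev.csr12 == 0);    /* write is still sitting in the bridge */
    (void)toy_ioread32(&br);   /* read-back forces delivery */
    assert(dev.csr12 == 0x1u); /* device has now seen the write */
}
```

A read-back only guarantees that the write has arrived at the chip. It says nothing about how long the chip or the PHY then needs before it is ready, which is why the timing question is raised separately.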


> +
> /* Find the connected MII xcvrs.
> Doing this in open() would allow detecting external xcvrs
> later, but takes much time. */
>
> >Are there any config option differences?
> >e.g. MWI or MMIO options enabled on 64-bit but not 32-bit?
>
> I verified that there are no differences.

ok. thanks.

...
> Applied the patch, here is the output
>
> 0000:00:07.0: tulip_stop_rxtx() failed (CSR5 0xf0660000 CSR6 0xb3862002)
...

Sorry, I don't have time to decode what these mean right now.
But I think the publicly available tulip chips docs sufficiently
explain what the registers mean and what state the chip is in.
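
As a starting point for decoding those values, here is a small user-space C sketch that pulls out the fields most relevant to a stop request. The masks follow the 21143 register layout as it is commonly documented (transmit process state in CSR5 bits 22:20, receive process state in CSR5 bits 19:17, start-transmit and start-receive in CSR6 bits 13 and 1); treat them as assumptions to verify against the datasheet. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout; verify against the 21143 hardware reference. */
#define CSR5_TS_MASK 0x00700000u /* transmit process state, bits 22:20 */
#define CSR5_RS_MASK 0x000e0000u /* receive process state, bits 19:17 */
#define CSR6_TX_ON   0x00002000u /* start/stop transmit, bit 13 */
#define CSR6_RX_ON   0x00000002u /* start/stop receive, bit 1 */

/* Hypothetical helper: 3-bit transmit process state from CSR5. */
unsigned int csr5_tx_state(uint32_t csr5)
{
    return (csr5 & CSR5_TS_MASK) >> 20;
}

/* Hypothetical helper: 3-bit receive process state from CSR5. */
unsigned int csr5_rx_state(uint32_t csr5)
{
    return (csr5 & CSR5_RS_MASK) >> 17;
}

/* Nonzero only if both process state fields read 0 (stopped). */
int rxtx_stopped(uint32_t csr5)
{
    return (csr5 & (CSR5_TS_MASK | CSR5_RS_MASK)) == 0;
}

/* Self-check against the values printed in the log. */
void check_logged_csrs(void)
{
    const uint32_t csr5 = 0xf0660000u;
    const uint32_t csr6 = 0xb3862002u;

    assert(csr5_tx_state(csr5) == 6); /* transmit state field is 110b */
    assert(csr5_rx_state(csr5) == 3); /* receive state field is 011b */
    assert(!rxtx_stopped(csr5));      /* neither process is stopped */
    assert((csr6 & CSR6_TX_ON) != 0); /* transmit start bit is set */
    assert((csr6 & CSR6_RX_ON) != 0); /* receive start bit is set */
}
```

Under these assumptions, both process state fields in CSR5 0xf0660000 are nonzero, so neither the transmit nor the receive process has reached the stopped state, which would be consistent with the "failed" message. What states 6 and 3 mean exactly has to be looked up in the chip documentation.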

> I was able to get some more information on the bootup sequence with the
> updates.
> Here is the output now from the driver
>
> Linux Tulip driver version 1.1.13 (May 11, 2002)
> PCI: Enabling device 0000:00:07.0 (0045 -> 0047)
> tulip0: Old format EEPROM on 'Cobalt Microserver' board. Using
> substitute media control info.
> tulip0: EEPROM default media type Autosense.
> tulip0: Index #0 - Media MII (#11) described by a 21142 MII PHY (3) block.
> tulip0: ***WARNING***: No MII transceiver found!

ok. I assume this is unpatched.

thanks,
grant

> eth0: Digital DS21143 Tulip rev 65 at ffffffffb0001400,
> 00:10:E0:00:32:DE, IRQ 19.
> PCI: Enabling device 0000:00:0c.0 (0005 -> 0007)
> tulip1: Old format EEPROM on 'Cobalt Microserver' board. Using
> substitute media control info.
> tulip1: EEPROM default media type Autosense.
> tulip1: Index #0 - Media MII (#11) described by a 21142 MII PHY (3) block.
> tulip1: ***WARNING***: No MII transceiver found!
> eth1: Digital DS21143 Tulip rev 65 at ffffffffb0001480,
> 00:10:E0:00:32:DF, IRQ 20.
>
>
> --
> ----
> Jim Gifford
> [email protected]

2005-04-01 18:26:22

by Grant Grundler

Subject: Re: 64bit build of tulip driver

On Fri, Apr 01, 2005 at 08:46:33AM -0800, Jim Gifford wrote:
> >Code paths exist in tulip_select_media() where the last thing the
> >driver does to the NIC is io_write(). This could easily be a posted
> >write flush problem. Does replacing flush_cache_all() with
> >"ioread32(ioaddr + CSR12)" also work?
> >
> >The code immediately before this calls tulip_select_media().
>
> Didn't work,

Can you try replacing flush_cache_all() with the following?
ioread32(ioaddr + CSR12);
udelay(500); /* random delay until someone looks up what is spec'd */

> I'm going to revert back and try your code and see if it
> fixes the issue.

Erm...the code in parisc-linux tree won't have the COBALT hacks.
You might try adding selective bits from the parisc-linux tulip.


The fact that flush_cache_all() changes the behavior made me
wonder if IORESOURCE_CACHEABLE is set in the pci resource.
But that doesn't seem to matter for ppc (32 or 64).
Notes on what I learned are below.

arch/ppc64/kernel/iomap.c doesn't look at that flag.
arch/ppc/kernel/io.c:pci_ioremap() has the nice comment:
        if (flags & IORESOURCE_MEM)
                /* Not checking IORESOURCE_CACHEABLE because PPC does
                 * not currently distinguish between ioremap and
                 * ioremap_nocache.
                 */
                return ioremap(start, len);

ioremap resolves to:
        void __iomem *
        ioremap64(unsigned long long addr, unsigned long size)
        {
                return __ioremap(addr, size, _PAGE_NO_CACHE);
        }

I *think* (too many ifdefs) ppc64 does the same in arch/ppc64/mm/init.c.
Caching is clearly not an issue for accessing MMIO space via pci_iomap().

grant

2005-04-01 20:26:18

by Jim Gifford

Subject: Re: 64bit build of tulip driver

Grant,
Thank you, I took your driver as a reference and added in the cobalt
specifics to the eeprom.c file, works perfectly now.


2005-04-01 22:14:17

by Grant Grundler

Subject: Re: 64bit build of tulip driver

On Fri, Apr 01, 2005 at 12:23:25PM -0800, Jim Gifford wrote:
> Grant,
> Thank you, I took your driver as a reference and added in the cobalt
> specifics to the eeprom.c file, works perfectly now.

Cool! very welcome.

Can you do me a favor and submit a diff of all the tulip changes
you have at this point back to lkml (and whatever other lists are cc'd)?

jgarzik might accept your bits and ignore the parts that have been
submitted/rejected before. But whatever you post will get archived
with this thread for others to find in the future.

thanks,
grant

2005-04-02 04:38:37

by Jim Gifford

Subject: Re: 64bit build of tulip driver

With Grant's help I was able to get the tulip driver to work with 64 bit
MIPS.

Again Thanx Grant. Attached is the patch I used.



Attachments:
tulip-kernel[1].patch (6.08 kB)