2003-01-05 05:00:06

by Martin J. Bligh

Subject: [PATCH] Fix starfire compiler warning on PAE

diff -urpN -X /home/fletch/.diff.exclude 12-boot_error/drivers/net/starfire.c 19-fix_starfire_warning/drivers/net/starfire.c
--- 12-boot_error/drivers/net/starfire.c Fri Dec 13 23:17:59 2002
+++ 19-fix_starfire_warning/drivers/net/starfire.c Thu Jan 2 22:18:18 2003
@@ -1847,15 +1847,15 @@ static int netdev_close(struct net_devic
 
 #ifdef __i386__
 	if (debug > 2) {
-		printk("\n"KERN_DEBUG" Tx ring at %8.8x:\n",
-		       np->tx_ring_dma);
+		printk("\n"KERN_DEBUG" Tx ring at %9.9Lx:\n",
+		       (u64) np->tx_ring_dma);
 		for (i = 0; i < 8 /* TX_RING_SIZE is huge! */; i++)
 			printk(KERN_DEBUG " #%d desc. %8.8x %8.8x -> %8.8x.\n",
 			       i, le32_to_cpu(np->tx_ring[i].status),
 			       le32_to_cpu(np->tx_ring[i].first_addr),
 			       le32_to_cpu(np->tx_done_q[i].status));
-		printk(KERN_DEBUG " Rx ring at %8.8x -> %p:\n",
-		       np->rx_ring_dma, np->rx_done_q);
+		printk(KERN_DEBUG " Rx ring at %9.9Lx -> %p:\n",
+		       (u64) np->rx_ring_dma, np->rx_done_q);
 		if (np->rx_done_q)
 			for (i = 0; i < 8 /* RX_RING_SIZE */; i++) {
 				printk(KERN_DEBUG " #%d desc. %8.8x -> %8.8x\n",
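
For readers unfamiliar with the warning: with PAE enabled, dma_addr_t becomes a 64-bit type, so passing it to a %x conversion no longer matches. The patch widens the value explicitly and prints it with a 64-bit conversion. A minimal userspace sketch of the same idea follows; the typedef and function name are hypothetical stand-ins, and userspace printf spells the conversion %llx where the kernel patch uses %Lx.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's dma_addr_t; 64 bits wide here,
 * as it would be on a PAE build. */
typedef uint64_t demo_dma_addr_t;

/* Format a bus address the way the patched printk does: cast to a fixed
 * 64-bit type first, then use a 64-bit hex conversion. The cast makes the
 * call correct whether the underlying type is 32 or 64 bits wide. */
static int format_ring_addr(char *buf, size_t len, demo_dma_addr_t addr)
{
	return snprintf(buf, len, "Tx ring at %9.9llx",
			(unsigned long long)addr);
}
```

The .9 precision pads to at least nine hex digits, enough for a 36-bit PAE physical address.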


2003-01-06 20:48:44

by Ion Badulescu

Subject: Re: [PATCH] Fix starfire compiler warning on PAE

On Sat, 04 Jan 2003 21:08:33 -0800, Martin J. Bligh wrote:

> diff -urpN -X /home/fletch/.diff.exclude
> 12-boot_error/drivers/net/starfire.c
> 19-fix_starfire_warning/drivers/net/starfire.c
> --- 12-boot_error/drivers/net/starfire.c Fri Dec 13 23:17:59 2002
> +++ 19-fix_starfire_warning/drivers/net/starfire.c Thu Jan 2 22:18:18 2003

Fix the compiler warning, yes; fix the driver for 64-bit dma_addr_t, no.
It may work with PAE, by chance, if all addresses returned by pci_map_single
and friends are < (1 << 33), but not otherwise.

Jeff already has an updated starfire driver in his queue, complete with
full 64-bit support.

[the cc: to the maintainer is always appreciated...]

Ion
[starfire driver maintainer]

--
It is better to keep your mouth shut and be thought a fool,
than to open it and remove all doubt.

2003-01-06 21:14:31

by Martin J. Bligh

Subject: Re: [PATCH] Fix starfire compiler warning on PAE

> Fix the compiler warning, yes; fix the driver for 64-bit dma_addr_t, no.
> It may work with PAE, by chance, if all addresses returned by pci_map_single
> and friends are < (1 << 33), but not otherwise.

Odd. It seems to work with PAE now. pci_map_single just casts an address
though ... are the things you're passing it always allocated from ZONE_NORMAL?
I run these all the time on 16GB machines with 16 processors (ia32 NUMA-Q).

> Jeff already has an updated starfire driver in his queue, complete with
> full 64-bit support.

Cool! Sorry, I've just been seeing that warning for about 1 year, and was
sick of it.

> [the cc: to the maintainer is always appreciated...]

Sorry, missed that. But in that case, I have another question for you ;-)
Why do I get weird errors like this:

Jan 6 10:09:53 larry kernel: eth0: Increasing Tx FIFO threshold to 80 bytes
Jan 6 10:09:56 larry kernel: eth0: Increasing Tx FIFO threshold to 96 bytes
Jan 6 10:09:56 larry kernel: eth0: Increasing Tx FIFO threshold to 112 bytes
Jan 6 10:10:09 larry kernel: eth0: Increasing Tx FIFO threshold to 128 bytes
Jan 6 10:10:09 larry kernel: eth0: Increasing Tx FIFO threshold to 144 bytes
Jan 6 10:10:12 larry kernel: eth0: Increasing Tx FIFO threshold to 160 bytes
Jan 6 10:10:12 larry kernel: eth0: Increasing Tx FIFO threshold to 176 bytes
Jan 6 10:10:14 larry kernel: eth0: Increasing Tx FIFO threshold to 192 bytes
Jan 6 10:10:30 larry kernel: eth0: Increasing Tx FIFO threshold to 208 bytes
Jan 6 10:10:32 larry kernel: eth0: Increasing Tx FIFO threshold to 224 bytes
Jan 6 10:10:39 larry kernel: eth0: Increasing Tx FIFO threshold to 240 bytes
Jan 6 10:10:39 larry kernel: eth0: Increasing Tx FIFO threshold to 256 bytes
Jan 6 10:10:39 larry kernel: eth0: Increasing Tx FIFO threshold to 272 bytes
Jan 6 10:10:46 larry kernel: eth0: Increasing Tx FIFO threshold to 288 bytes
Jan 6 10:10:46 larry kernel: eth0: Increasing Tx FIFO threshold to 304 bytes
Jan 6 10:10:47 larry kernel: eth0: Increasing Tx FIFO threshold to 320 bytes
Jan 6 10:10:47 larry kernel: eth0: Increasing Tx FIFO threshold to 336 bytes
Jan 6 10:10:57 larry kernel: eth0: Increasing Tx FIFO threshold to 352 bytes
Jan 6 10:10:57 larry kernel: eth0: Increasing Tx FIFO threshold to 368 bytes
Jan 6 10:11:22 larry kernel: eth0: Increasing Tx FIFO threshold to 384 bytes
Jan 6 10:11:37 larry kernel: eth0: Increasing Tx FIFO threshold to 400 bytes
Jan 6 10:11:58 larry kernel: eth0: Increasing Tx FIFO threshold to 416 bytes
Jan 6 10:12:29 larry kernel: eth0: Increasing Tx FIFO threshold to 432 bytes
Jan 6 10:12:29 larry kernel: eth0: Increasing Tx FIFO threshold to 448 bytes
Jan 6 10:12:29 larry kernel: eth0: Increasing Tx FIFO threshold to 464 bytes

I also recall getting errors like "Something Wicked happened", but I
don't seem to be able to find them in the log right now.

Thanks,

M.

2003-01-06 21:27:31

by Ion Badulescu

Subject: Re: [PATCH] Fix starfire compiler warning on PAE

On Mon, 6 Jan 2003, Martin J. Bligh wrote:

> > Fix the compiler warning, yes; fix the driver for 64-bit dma_addr_t, no.
> > It may work with PAE, by chance, if all addresses returned by pci_map_single
> > and friends are < (1 << 33), but not otherwise.
>
> Odd. It seems to work with PAE now. pci_map_single just casts an address
> though ... are the things you're passing it always allocated from ZONE_NORMAL?
> I run these all the time on 16GB machines with 16 processors (ia32 NUMA-Q).

I get them from the network stack, and I suppose they're guaranteed to be
in ZONE_NORMAL as long as the adapter doesn't list NETIF_F_HIGHDMA among
its features. So yes, you're right that it will probably work with PAE,
but it won't work with a real 64-bit box, methinks.
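
To make the concern concrete, here is a minimal sketch of what goes wrong when a 64-bit bus address is stored in a 32-bit descriptor field. It assumes the descriptor address field is 32 bits wide, which the le32_to_cpu() accesses in the patch context suggest; the struct and function names are hypothetical, and byte-order conversion is left out for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical descriptor with a 32-bit address field, modelled loosely on
 * the status/first_addr accesses visible in the patch context. */
struct demo_tx_desc {
	uint32_t status;
	uint32_t first_addr;
};

/* True if the bus address survives being stored in a 32-bit field. */
static bool demo_dma_addr_fits_32(uint64_t addr)
{
	return (addr >> 32) == 0;
}

/* Store the address as a 32-bit-only driver would: the upper bits are
 * dropped silently, so the card would DMA to the wrong location. */
static void demo_fill_desc(struct demo_tx_desc *d, uint64_t addr)
{
	d->first_addr = (uint32_t)addr;
}
```

As long as every buffer comes from low memory, the check always passes and the truncation is harmless, which would explain why the driver appears to work under PAE.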

> Cool! Sorry, I've just been seeing that warning for about 1 year, and was
> sick of it.

I was kinda busy and didn't get a chance to do much with the driver over
the last 10 months or so... I also didn't have any boxes which could do
real PAE, so I didn't try very hard -- until recently, that is. :-)

> Sorry, missed that. But in that case, I have another question for you ;-)
> Why do I get weird errors like this:
>
> Jan 6 10:09:53 larry kernel: eth0: Increasing Tx FIFO threshold to 80 bytes
> Jan 6 10:09:56 larry kernel: eth0: Increasing Tx FIFO threshold to 96 bytes

A few of these are mostly normal, it's the card signalling the driver that
it is getting a Tx fifo underrun, and the driver responds by increasing
the threshold at which the card starts transmitting the packet.

> Jan 6 10:12:29 larry kernel: eth0: Increasing Tx FIFO threshold to 448 bytes
> Jan 6 10:12:29 larry kernel: eth0: Increasing Tx FIFO threshold to 464 bytes

These are very high, however; it could be that there really is very high
contention on the PCI bus, but otherwise I can't explain them.

If they stop before reaching 1500, then it's probably ok and just
something you're gonna have to live with. Otherwise it's a bug of some
sort.
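
The behaviour described above can be sketched roughly as follows. This is an illustration, not the driver's actual code: the names are hypothetical, the 16-byte step is inferred from the increments in the quoted log, and treating 1500 as a hard ceiling is an assumption based on the figure mentioned above.

```c
/* Assumptions: the step matches the 16-byte increments seen in the log,
 * and 1500 is used as a ceiling for illustration only. */
#define DEMO_TX_THRESHOLD_STEP 16
#define DEMO_TX_THRESHOLD_MAX  1500

struct demo_nic {
	unsigned int tx_threshold; /* bytes buffered before transmit starts */
};

/* Called when the card reports a Tx FIFO underrun. Returns 1 if the
 * threshold was raised, 0 if raising it would exceed the ceiling. */
static int demo_on_tx_underrun(struct demo_nic *nic)
{
	if (nic->tx_threshold + DEMO_TX_THRESHOLD_STEP > DEMO_TX_THRESHOLD_MAX)
		return 0;
	nic->tx_threshold += DEMO_TX_THRESHOLD_STEP;
	return 1;
}
```

Each underrun makes the card wait for more of the packet to arrive in its FIFO before it starts transmitting, trading a little latency for fewer underruns.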

> I also recall getting errors like "Something Wicked happened", but I
> don't seem to be able to find them in the log right now.

Yeah, the interrupt status (printed right after the messages) would have
been helpful...

That said, there is a known race condition in the v1.3.6 of the driver,
which could cause timeouts and errors under certain circumstances. It's all
fixed in 1.3.9 (the full 64-bit support version) and 1.4.0 (NAPI support).
Both of those will do real 64-bit transfers, without the need for double
buffering, so it should help on your 16GB boxes.

I could forward you one of those versions, if you want to test it. In
fact, I'd appreciate some testing! :-)

Thanks,
Ion

--
It is better to keep your mouth shut and be thought a fool,
than to open it and remove all doubt.

2003-01-06 21:44:31

by Martin J. Bligh

Subject: Re: [PATCH] Fix starfire compiler warning on PAE

>> Odd. It seems to work with PAE now. pci_map_single just casts an address
>> though ... are the things you're passing it always allocated from ZONE_NORMAL?
>> I run these all the time on 16GB machines with 16 processors (ia32 NUMA-Q).
>
> I get them from the network stack, and I suppose they're guaranteed to be
> in ZONE_NORMAL as long as the adapter doesn't list NETIF_F_HIGHDMA among
> its features. So yes, you're right that it will probably work with PAE,
> but it won't work with a real 64-bit box, methinks.

Well it seems like the Right Thing To Do anyway to make it all 64bit clean ;-)

> A few of these are mostly normal, it's the card signalling the driver that
> it is getting a Tx fifo underrun, and the driver responds by increasing
> the threshold at which the card starts transmitting the packet.

Can we not print them onto the console if they're normal then?

>> Jan 6 10:12:29 larry kernel: eth0: Increasing Tx FIFO threshold to 448 bytes
>> Jan 6 10:12:29 larry kernel: eth0: Increasing Tx FIFO threshold to 464 bytes
>
> These are very high, however; it could be that there really is very high
> contention on the PCI bus, but otherwise I can't explain them.
>
> If they stop before reaching 1500, then it's probably ok and just
> something you're gonna have to live with. Otherwise it's a bug of some
> sort.

I think the card took itself offline at this point, so it smells like a bug.
That's only been happening recently though (I've only noticed in the last
week from a year or two of use).

>> I also recall getting errors like "Something Wicked happened", but I
>> don't seem to be able to find them in the log right now.
>
> Yeah, the interrupt status (printed right after the messages) would have
> been helpful...

OK, will try to grab that.

> That said, there is a known race condition in the v1.3.6 of the driver,
> which could cause timeouts and errors under certain circumstances. It's all
> fixed in 1.3.9 (the full 64-bit support version) and 1.4.0 (NAPI support).
> Both of those will do real 64-bit transfers, without the need for double
> buffering, so it should help on your 16GB boxes.
>
> I could forward you one of those versions, if you want to test it. In
> fact, I'd appreciate some testing! :-)

Sure, send me the patch, these boxes bring out races like dying rich aunts
bring out friendly relatives. And I have a cabinet drawer full of starfire
cards ;-)

M.

2003-01-06 22:16:56

by Ion Badulescu

Subject: Re: [PATCH] Fix starfire compiler warning on PAE

On Mon, 6 Jan 2003, Martin J. Bligh wrote:

> > A few of these are mostly normal, it's the card signalling the driver that
> > it is getting a Tx fifo underrun, and the driver responds by increasing
> > the threshold at which the card starts transmitting the packet.
>
> Can we not print them onto the console if they're normal then?

They're only semi-normal, since they signal some unusual contention on the
PCI bus... but yeah, I guess we could lower their priority to KERN_INFO.
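
Whether a lower priority actually keeps the message off the console depends on the console loglevel. A minimal sketch of the filtering rule follows; the names are hypothetical, the numeric levels are the kernel's standard ones, and the console loglevel (often 7 by default) is whatever the administrator has configured.

```c
/* Numeric printk levels as used by the kernel: lower is more urgent. */
enum demo_loglevel {
	DEMO_KERN_WARNING = 4,
	DEMO_KERN_NOTICE  = 5,
	DEMO_KERN_INFO    = 6,
	DEMO_KERN_DEBUG   = 7
};

/* A message reaches the console only if its level is numerically lower
 * than the current console loglevel. It is still recorded in the kernel
 * log buffer either way. */
static int demo_reaches_console(int msg_level, int console_loglevel)
{
	return msg_level < console_loglevel;
}
```

With a console loglevel of 7, a KERN_INFO message is still shown; it is suppressed once the console loglevel is set to 6 or lower.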

> I think the card took itself offline at this point, so it smells like a bug.
> That's only been happening recently though (I've only noticed in the last
> week from a year or two of use).

It could definitely be a bug (known or not). Anyway, it would be good to
test it with the latest version of the driver.

> Sure, send me the patch, these boxes bring out races like dying rich aunts
> bring out friendly relatives. And I have a cabinet drawer full of starfire
> cards ;-)

All right, I'll forward it off-list.

Thanks,
Ion

--
It is better to keep your mouth shut and be thought a fool,
than to open it and remove all doubt.

2003-02-10 03:24:42

by Martin J. Bligh

Subject: Testing new starfire driver for 2.5

I've been using your new starfire driver for a couple of weeks now,
and it's in 2.5-mjb ... been testing on the 16x NUMA-Q w/16GB RAM.

Not only does it work just fine with no problems, all the weird error
messages I had before went away ;-)

Seems cool to me, are you ready to push this to Linus?

M.

2003-02-10 21:47:20

by Ion Badulescu

Subject: Re: Testing new starfire driver for 2.5

On Sun, 9 Feb 2003, Martin J. Bligh wrote:

> I've been using your new starfire driver for a couple of weeks now,
> and it's in 2.5-mjb ... been testing on the 16x NUMA-Q w/16GB RAM.
>
> Not only does it work just fine with no problems, all the weird error
> messages I had before went away ;-)

Good to hear that...

Which version did you end up using: 1.4.0 or 1.3.9?

> Seems cool to me, are you ready to push this to Linus?

Well, 1.3.9 is already in Marcelo's tree. I'll push 1.4.1 -- which is
1.4.0 plus a compile option for enabling/disabling NAPI -- to Linus as
soon as I finish testing with both compile options.

Thanks,
Ion

--
It is better to keep your mouth shut and be thought a fool,
than to open it and remove all doubt.

2003-02-10 22:23:54

by Martin J. Bligh

Subject: Re: Testing new starfire driver for 2.5

>> I've been using your new starfire driver for a couple of weeks now,
>> and it's in 2.5-mjb ... been testing on the 16x NUMA-Q w/16GB RAM.
>>
>> Not only does it work just fine with no problems, all the weird error
>> messages I had before went away ;-)
>
> Good to hear that...
>
> Which version did you end up using: 1.4.0 or 1.3.9?

1.4.0

>> Seems cool to me, are you ready to push this to Linus?
>
> Well, 1.3.9 is already in Marcelo's tree. I'll push 1.4.1 -- which is
> 1.4.0 plus a compile option for enabling/disabling NAPI -- to Linus as
> soon as I finish testing with both compile options.

Sounds good to me! Thanks for fixing all this,

M.