This patch adds GRO support to mv643xx_eth by making it invoke
napi_gro_receive instead of netif_receive_skb.
Signed-off-by: Soeren Moch <[email protected]>
Signed-off-by: Sebastian Hesselbarth <[email protected]>
---
Cc: "David S. Miller" <[email protected]>
Cc: Lennert Buytenhek <[email protected]>
Cc: Andrew Lunn <[email protected]>
Cc: Jason Cooper <[email protected]>
Cc: Florian Fainelli <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Dale Farnsworth <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
---
drivers/net/ethernet/marvell/mv643xx_eth.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/marvell/mv643xx_eth.c b/drivers/net/ethernet/marvell/mv643xx_eth.c
index 305038f..c850d04 100644
--- a/drivers/net/ethernet/marvell/mv643xx_eth.c
+++ b/drivers/net/ethernet/marvell/mv643xx_eth.c
@@ -604,7 +604,7 @@ static int rxq_process(struct rx_queue *rxq, int budget)
lro_receive_skb(&rxq->lro_mgr, skb, (void *)cmd_sts);
lro_flush_needed = 1;
} else
- netif_receive_skb(skb);
+ napi_gro_receive(&mp->napi, skb);
continue;
--
1.7.10.4
Hi,
On Thu, Apr 11, 2013 at 02:40:23PM +0200, Sebastian Hesselbarth wrote:
> This patch adds GRO support to mv643xx_eth by making it invoke
> napi_gro_receive instead of netif_receive_skb.
>
> Signed-off-by: Soeren Moch <[email protected]>
> Signed-off-by: Sebastian Hesselbarth <[email protected]>
> ---
> Cc: "David S. Miller" <[email protected]>
> Cc: Lennert Buytenhek <[email protected]>
> Cc: Andrew Lunn <[email protected]>
> Cc: Jason Cooper <[email protected]>
> Cc: Florian Fainelli <[email protected]>
> Cc: Benjamin Herrenschmidt <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Cc: Dale Farnsworth <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> ---
> drivers/net/ethernet/marvell/mv643xx_eth.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/marvell/mv643xx_eth.c b/drivers/net/ethernet/marvell/mv643xx_eth.c
> index 305038f..c850d04 100644
> --- a/drivers/net/ethernet/marvell/mv643xx_eth.c
> +++ b/drivers/net/ethernet/marvell/mv643xx_eth.c
> @@ -604,7 +604,7 @@ static int rxq_process(struct rx_queue *rxq, int budget)
> lro_receive_skb(&rxq->lro_mgr, skb, (void *)cmd_sts);
> lro_flush_needed = 1;
> } else
> - netif_receive_skb(skb);
> + napi_gro_receive(&mp->napi, skb);
>
> continue;
I remember experimenting with this on 3.6 a few months ago with this
driver and finally switching back to something like the following instead,
which showed better performance in my tests:

	if (skb->ip_summed == CHECKSUM_UNNECESSARY)
		napi_gro_receive(napi, skb);
	else
		netif_receive_skb(skb);

Unfortunately I don't have more details, as my commit message was rather
short due to this resulting from experimentation. Did you verify that
you did not lose any performance in various workloads? I was playing
with bridges at the time; it's possible that I got better performance
on bridging with netif_receive_skb() than with napi_gro_receive().
Regards,
Willy
On Thu, Apr 11, 2013 at 3:13 PM, Willy Tarreau <[email protected]> wrote:
> On Thu, Apr 11, 2013 at 02:40:23PM +0200, Sebastian Hesselbarth wrote:
>> This patch adds GRO support to mv643xx_eth by making it invoke
>> napi_gro_receive instead of netif_receive_skb.
>>
>> Signed-off-by: Soeren Moch <[email protected]>
>> Signed-off-by: Sebastian Hesselbarth <[email protected]>
>> ---
>> Cc: "David S. Miller" <[email protected]>
>> Cc: Lennert Buytenhek <[email protected]>
>> Cc: Andrew Lunn <[email protected]>
>> Cc: Jason Cooper <[email protected]>
>> Cc: Florian Fainelli <[email protected]>
>> Cc: Benjamin Herrenschmidt <[email protected]>
>> Cc: Paul Mackerras <[email protected]>
>> Cc: Dale Farnsworth <[email protected]>
>> Cc: [email protected]
>> Cc: [email protected]
>> Cc: [email protected]
>> Cc: [email protected]
>> ---
>> drivers/net/ethernet/marvell/mv643xx_eth.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/marvell/mv643xx_eth.c b/drivers/net/ethernet/marvell/mv643xx_eth.c
>> index 305038f..c850d04 100644
>> --- a/drivers/net/ethernet/marvell/mv643xx_eth.c
>> +++ b/drivers/net/ethernet/marvell/mv643xx_eth.c
>> @@ -604,7 +604,7 @@ static int rxq_process(struct rx_queue *rxq, int budget)
>> lro_receive_skb(&rxq->lro_mgr, skb, (void *)cmd_sts);
>> lro_flush_needed = 1;
>> } else
>> - netif_receive_skb(skb);
>> + napi_gro_receive(&mp->napi, skb);
>>
>> continue;
>
> I remember having experimented with this on 3.6 a few months ago with this
> driver and finally switching back to something like this instead which
> showed better performance on my tests :
>
> if (skb->ip_summed == CHECKSUM_UNNECESSARY)
> napi_gro_receive(napi, skb);
> else
> netif_receive_skb(skb);
>
> Unfortunately I don't have more details as my commit message was rather
> short due to this resulting from experimentation. Did you verify that
> you did not lose any performance in various workloads ? I was playing
> with bridges at this time, it's possible that I got better performance
> on bridging with netif_receive_skb() than with napi_gro_receive().
Hi Willy,
I did some simple tests on Dove/Cubox with 'netperf -cCD' and the
gso/gro/lro options on mv643xx_eth. The tests may not be sufficient,
as I am not that into net performance testing.

I tried today's net-next on top of 3.9-rc6 without any GRO patch, with
the initial patch (Soeren) and your proposed patch (Willy). The results
show that both patches allow a significant increase in throughput
compared to netif_receive_skb (!gro, !lro) alone. Having gro with lro
disabled gives some 2% more throughput compared to lro only.
Sebastian
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

87380  16384   16384    10.02    615.65      19.15    99.90    5.097   13.293  (3.9-rc6: gso)
87380  16384   16384    10.02    615.82      19.05    99.90    5.067   13.289  (3.9-rc6: gso, gro)
87380  16384   16384    10.03    747.44      23.17    99.80    5.079   10.938  (3.9-rc6: gso, lro)
87380  16384   16384    10.02    745.28      22.57    99.80    4.963   10.970  (3.9-rc6: gso, gro, lro)
87380  16384   16384    10.02    600.34      19.10    99.90    5.211   13.632  (3.9-rc6+soeren: gso)
87380  16384   16384    10.02    764.23      23.42    99.80    5.021   10.698  (3.9-rc6+soeren: gso, gro)
87380  16384   16384    10.02    749.81      23.13    99.80    5.055   10.904  (3.9-rc6+soeren: gso, lro)
87380  16384   16384    10.02    745.84      22.34    99.80    4.907   10.962  (3.9-rc6+soeren: gso, gro, lro)
87380  16384   16384    10.02    605.79      18.79    100.00   5.082   13.523  (3.9-rc6+willy: gso)
87380  16384   16384    10.02    765.64      24.68    99.80    5.281   10.678  (3.9-rc6+willy: gso, gro)
87380  16384   16384    10.02    750.30      26.02    99.80    5.682   10.897  (3.9-rc6+willy: gso, lro)
87380  16384   16384    10.03    749.40      21.86    99.80    4.778   10.910  (3.9-rc6+willy: gso, gro, lro)
Hi Sebastian,
On Thu, Apr 11, 2013 at 04:47:49PM +0200, Sebastian Hesselbarth wrote:
> I did some simple tests on Dove/Cubox with 'netperf -cCD' and
> gso/gro/lro options on
> mv643xx_eth. The tests may not be sufficient, as I am not that into
> net performance testing.
In fact the difference only happens when the NIC has not verified the
checksum itself IIRC, which should be for non-IPv4 traffic. I agree
that it's not easy to test a bridge with a cubox which has a single
port :-) Maybe you'll see a difference in IPv6 traffic or with VLAN
traffic, as I seem to remember this chip does not do cksum offloading
on VLANs, but I could be wrong.
> I tried todays net-next on top of 3.9-rc6 without any gro patch, with
> the initial
> patch (Soeren) and your proposed patch (Willy). The results show that
> both patches
> allow a significant increase in throughput compared to
> netif_receive_skb (!gro, !lro)
> alone. Having gro with lro disabled gives some 2% more throughput
> compared to lro only.
Indeed this is consistent with my memories, since Eric improved the
GRO path, it became faster than LRO on this chip.
Regards,
Willy
On Thu, Apr 11, 2013 at 5:03 PM, Willy Tarreau <[email protected]> wrote:
> On Thu, Apr 11, 2013 at 04:47:49PM +0200, Sebastian Hesselbarth wrote:
>> I tried todays net-next on top of 3.9-rc6 without any gro patch, with
>> the initial
>> patch (Soeren) and your proposed patch (Willy). The results show that
>> both patches
>> allow a significant increase in throughput compared to
>> netif_receive_skb (!gro, !lro)
>> alone. Having gro with lro disabled gives some 2% more throughput
>> compared to lro only.
>
> Indeed this is consistent with my memories, since Eric improved the
> GRO path, it became faster than LRO on this chip.
I don't have a strong opinion on whether Soeren's or your proposal should
be submitted. But I insist on having one of them in, as GRO significantly
improves the common use case, is enabled by default, and not as
constrained as LRO.
Sebastian
On Thu, Apr 11, 2013 at 05:27:03PM +0200, Sebastian Hesselbarth wrote:
> I don't have a strong opinion on whether Soeren's or your proposal should
> be submitted. But I insist on having one of them in, as GRO significantly
> improves the common use case, is enabled by default, and not as
> constrained as LRO.
I agree, use yours first, but we should keep an eye on this. Since you have
everything to run a test, please try to see if you can get netperf to run
over IPv6, I'm sure the NIC doesn't handle it.
Willy
On Thu, 2013-04-11 at 17:32 +0200, Willy Tarreau wrote:
> On Thu, Apr 11, 2013 at 05:27:03PM +0200, Sebastian Hesselbarth wrote:
> > I don't have a strong opinion on whether Soeren's or your proposal should
> > be submitted. But I insist on having one of them in, as GRO significantly
> > improves the common use case, is enabled by default, and not as
> > constrained as LRO.
>
> I agree, use yours first, but we should keep an eye on this. Since you have
> everything to run a test, please try to see if you can get netperf to run
> over IPv6, I'm sure the NIC doesn't handle it.
Willy, testing the checksum in the NIC driver itself prevents the stack
from doing GRO on packets the NIC could not checksum, as with GRE
tunneling for example.

So Sebastian's patch is better IMHO: just call the napi GRO handler and
let the core stack handle the details ;)
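For context, the unconditional call is safe because the core stack already
falls back to the plain receive path whenever GRO declines to aggregate a
packet. Below is a simplified, renamed sketch of that dispatch, assuming
2013-era APIs; napi_finish_sketch() is an invented name, and the real
napi_skb_finish() in net/core/dev.c handles a few more result codes.

#include <linux/netdevice.h>

/* Simplified illustration of the GRO result dispatch, not the real code. */
static gro_result_t napi_finish_sketch(gro_result_t ret, struct sk_buff *skb)
{
	switch (ret) {
	case GRO_NORMAL:
		/* GRO declined: hand the skb to the ordinary receive path. */
		if (netif_receive_skb(skb) == NET_RX_DROP)
			ret = GRO_DROP;
		break;
	case GRO_DROP:
		/* malformed packet: free it */
		kfree_skb(skb);
		break;
	default:
		/* merged into or held by a pending GRO chain: nothing to do */
		break;
	}
	return ret;
}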
On Thu, 2013-04-11 at 14:40 +0200, Sebastian Hesselbarth wrote:
> This patch adds GRO support to mv643xx_eth by making it invoke
> napi_gro_receive instead of netif_receive_skb.
>
> Signed-off-by: Soeren Moch <[email protected]>
> Signed-off-by: Sebastian Hesselbarth <[email protected]>
> ---
> Cc: "David S. Miller" <[email protected]>
> Cc: Lennert Buytenhek <[email protected]>
> Cc: Andrew Lunn <[email protected]>
> Cc: Jason Cooper <[email protected]>
> Cc: Florian Fainelli <[email protected]>
> Cc: Benjamin Herrenschmidt <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Cc: Dale Farnsworth <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> ---
> drivers/net/ethernet/marvell/mv643xx_eth.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/marvell/mv643xx_eth.c b/drivers/net/ethernet/marvell/mv643xx_eth.c
> index 305038f..c850d04 100644
> --- a/drivers/net/ethernet/marvell/mv643xx_eth.c
> +++ b/drivers/net/ethernet/marvell/mv643xx_eth.c
> @@ -604,7 +604,7 @@ static int rxq_process(struct rx_queue *rxq, int budget)
> lro_receive_skb(&rxq->lro_mgr, skb, (void *)cmd_sts);
> lro_flush_needed = 1;
> } else
> - netif_receive_skb(skb);
> + napi_gro_receive(&mp->napi, skb);
>
> continue;
>
Acked-by: Eric Dumazet <[email protected]>
On Thu, Apr 11, 2013 at 08:54:35AM -0700, Eric Dumazet wrote:
> On Thu, 2013-04-11 at 17:32 +0200, Willy Tarreau wrote:
> > On Thu, Apr 11, 2013 at 05:27:03PM +0200, Sebastian Hesselbarth wrote:
> > > I don't have a strong opinion on whether Soeren's or your proposal should
> > > be submitted. But I insist on having one of them in, as GRO significantly
> > > improves the common use case, is enabled by default, and not as
> > > constrained as LRO.
> >
> > I agree, use yours first, but we should keep an eye on this. Since you have
> > everything to run a test, please try to see if you can get netperf to run
> > over IPv6, I'm sure the NIC doesn't handle it.
>
> Willy, testing the checksum in the NIC driver itself prevents the stack
> doing GRO even if the NIC could not checksum the packet, as in GRE
> tunneling for example.
>
> So Sebastien patch is better IMHO : Just call the napi gro handler and
> let core stack handles the details ;)
OK, that makes sense indeed, I didn't think about this case. All
I remember was that the old call achieved a higher packet rate
than napi_gro_receive, but it was on an older kernel and I can't
be more specific after several months :-/
Cheers,
Willy
On Thu, 2013-04-11 at 18:02 +0200, Willy Tarreau wrote:
> OK, that makes sense indeed, I didn't think about this case. All
> I remember was that the old call achieved a higher packet rate
> than napi_gro_receive, but it was on an older kernel and I can't
> be more specifics after several months :-/
It's probably true that the GRO handler consumes more CPU for packets
that can't be aggregated in the end.

That's a trade-off, and maybe we could add a device feature in the core
stack to instruct the GRO handler to take a short cut for packets with no
checksum. Or, better, a sysctl so that a static_branch can be used.
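As a rough sketch of that idea only: nothing below exists in any kernel
tree, the static key name, helper and sysctl hook are invented, and the
GRO entry point is heavily simplified. With the 3.x static_key API, such
a short cut might look like this:

#include <linux/jump_label.h>
#include <linux/netdevice.h>

static struct static_key gro_csum_shortcut __read_mostly = STATIC_KEY_INIT_FALSE;

/* would be flipped from a (hypothetical) sysctl handler */
static void gro_csum_shortcut_set(bool enable)
{
	if (enable)
		static_key_slow_inc(&gro_csum_shortcut);
	else
		static_key_slow_dec(&gro_csum_shortcut);
}

static gro_result_t dev_gro_receive_sketch(struct napi_struct *napi,
					   struct sk_buff *skb)
{
	/*
	 * Short cut: if enabled and the NIC did not verify the checksum,
	 * skip the aggregation attempt and let the packet take the normal
	 * receive path via GRO_NORMAL.
	 */
	if (static_key_false(&gro_csum_shortcut) &&
	    skb->ip_summed != CHECKSUM_UNNECESSARY)
		return GRO_NORMAL;

	/* ... normal flow dissection and aggregation would follow ... */
	return GRO_NORMAL;
}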
On Thu, Apr 11, 2013 at 5:32 PM, Willy Tarreau <[email protected]> wrote:
> On Thu, Apr 11, 2013 at 05:27:03PM +0200, Sebastian Hesselbarth wrote:
>> I don't have a strong opinion on whether Soeren's or your proposal should
>> be submitted. But I insist on having one of them in, as GRO significantly
>> improves the common use case, is enabled by default, and not as
>> constrained as LRO.
>
> I agree, use yours first, but we should keep an eye on this. Since you have
> everything to run a test, please try to see if you can get netperf to run
> over IPv6, I'm sure the NIC doesn't handle it.
Willy,
out of curiosity I replayed all the tests using netperf/netserver with -6,
which enables IPv6. The overall results remain much the same here:
enabling GRO gives a huge improvement in achievable throughput, and the
difference between Soeren's patch and yours is negligible.
Sebastian
On Thu, Apr 11, 2013 at 06:59:11PM +0200, Sebastian Hesselbarth wrote:
> On Thu, Apr 11, 2013 at 5:32 PM, Willy Tarreau <[email protected]> wrote:
> > On Thu, Apr 11, 2013 at 05:27:03PM +0200, Sebastian Hesselbarth wrote:
> >> I don't have a strong opinion on whether Soeren's or your proposal should
> >> be submitted. But I insist on having one of them in, as GRO significantly
> >> improves the common use case, is enabled by default, and not as
> >> constrained as LRO.
> >
> > I agree, use yours first, but we should keep an eye on this. Since you have
> > everything to run a test, please try to see if you can get netperf to run
> > over IPv6, I'm sure the NIC doesn't handle it.
>
> Willy,
>
> out of curiosity I replayed all tests using netperf/netserver with -6 which
> enables ipv6. The overall results remain quite the same here:
> enabling support for GRO gives a huge improvement in achievable
> throughput, and the difference between Soeren's and your patch is
> neglectible.
Perfect, thank you for testing this !
Best regards,
Willy
From: Sebastian Hesselbarth <[email protected]>
Date: Thu, 11 Apr 2013 17:27:03 +0200
> On Thu, Apr 11, 2013 at 5:03 PM, Willy Tarreau <[email protected]> wrote:
>> On Thu, Apr 11, 2013 at 04:47:49PM +0200, Sebastian Hesselbarth wrote:
>>> I tried todays net-next on top of 3.9-rc6 without any gro patch, with
>>> the initial
>>> patch (Soeren) and your proposed patch (Willy). The results show that
>>> both patches
>>> allow a significant increase in throughput compared to
>>> netif_receive_skb (!gro, !lro)
>>> alone. Having gro with lro disabled gives some 2% more throughput
>>> compared to lro only.
>>
>> Indeed this is consistent with my memories, since Eric improved the
>> GRO path, it became faster than LRO on this chip.
>
> I don't have a strong opinion on whether Soeren's or your proposal should
> be submitted. But I insist on having one of them in, as GRO significantly
> improves the common use case, is enabled by default, and not as
> constrained as LRO.
I think, as per other drivers, LRO should be eliminated completely from
all drivers, including this one, and GRO used exclusively instead.
On Thu, 2013-04-11 at 13:31 -0400, David Miller wrote:
> I think, as per other drivers, LRO should be eliminated completely from
> all drivers, including this one, and GRO used exclusively instead.
This would be awesome.
AFAIK current LRO users are :
drivers/infiniband/hw/nes/nes_hw.c
drivers/net/ethernet/marvell/mv643xx_eth.c
drivers/net/ethernet/pasemi/pasemi_mac.c
On Thu, Apr 11, 2013 at 01:31:19PM -0400, David Miller wrote:
> From: Sebastian Hesselbarth <[email protected]>
> Date: Thu, 11 Apr 2013 17:27:03 +0200
>
> > On Thu, Apr 11, 2013 at 5:03 PM, Willy Tarreau <[email protected]> wrote:
> >> On Thu, Apr 11, 2013 at 04:47:49PM +0200, Sebastian Hesselbarth wrote:
> >>> I tried todays net-next on top of 3.9-rc6 without any gro patch, with
> >>> the initial
> >>> patch (Soeren) and your proposed patch (Willy). The results show that
> >>> both patches
> >>> allow a significant increase in throughput compared to
> >>> netif_receive_skb (!gro, !lro)
> >>> alone. Having gro with lro disabled gives some 2% more throughput
> >>> compared to lro only.
> >>
> >> Indeed this is consistent with my memories, since Eric improved the
> >> GRO path, it became faster than LRO on this chip.
> >
> > I don't have a strong opinion on whether Soeren's or your proposal should
> > be submitted. But I insist on having one of them in, as GRO significantly
> > improves the common use case, is enabled by default, and not as
> > constrained as LRO.
>
> I think, as per other drivers, LRO should be eliminated completely from
> all drivers, including this one, and GRO used exclusively instead.
Eric provided me with one such experimental patch in the past for this
driver. It worked for me but we never tried to clean it up to propose
it for inclusion.
If anyone is interested, I might still have it in experimental shape.
Willy
On Thu, 2013-04-11 at 14:40 +0200, Sebastian Hesselbarth wrote:
> This patch adds GRO support to mv643xx_eth by making it invoke
> napi_gro_receive instead of netif_receive_skb.
The inet_lro support should be removed at the same time; inet_lro is now
deprecated and there should be no need to keep both of them.
Ben.
> Signed-off-by: Soeren Moch <[email protected]>
> Signed-off-by: Sebastian Hesselbarth <[email protected]>
> ---
> Cc: "David S. Miller" <[email protected]>
> Cc: Lennert Buytenhek <[email protected]>
> Cc: Andrew Lunn <[email protected]>
> Cc: Jason Cooper <[email protected]>
> Cc: Florian Fainelli <[email protected]>
> Cc: Benjamin Herrenschmidt <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Cc: Dale Farnsworth <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> ---
> drivers/net/ethernet/marvell/mv643xx_eth.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/marvell/mv643xx_eth.c b/drivers/net/ethernet/marvell/mv643xx_eth.c
> index 305038f..c850d04 100644
> --- a/drivers/net/ethernet/marvell/mv643xx_eth.c
> +++ b/drivers/net/ethernet/marvell/mv643xx_eth.c
> @@ -604,7 +604,7 @@ static int rxq_process(struct rx_queue *rxq, int budget)
> lro_receive_skb(&rxq->lro_mgr, skb, (void *)cmd_sts);
> lro_flush_needed = 1;
> } else
> - netif_receive_skb(skb);
> + napi_gro_receive(&mp->napi, skb);
>
> continue;
>
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
On Thu, 2013-04-11 at 19:51 +0200, Willy Tarreau wrote:
> Eric provided me with one such experimental patch in the past for this
> driver. It worked for me but we never tried to clean it up to propose
> it for inclusion.
>
> If anyone is interested, I might still have it in experimental shape.
We are interested a lot ;)
Thanks !
On Thu, Apr 11, 2013 at 10:59:15AM -0700, Eric Dumazet wrote:
> On Thu, 2013-04-11 at 19:51 +0200, Willy Tarreau wrote:
>
> > Eric provided me with one such experimental patch in the past for this
> > driver. It worked for me but we never tried to clean it up to propose
> > it for inclusion.
> >
> > If anyone is interested, I might still have it in experimental shape.
>
> We are interested a lot ;)
Just found it in one of my local branches :-)
Here it is. It was experimental. At first glance, I see it does not
remove NETIF_F_LRO, as we used it for experimentation only, but that's
a trivial thing to fix.
Original work by you, Eric; tested by me on 3.6.
diff --git a/drivers/net/ethernet/marvell/mv643xx_eth.c b/drivers/net/ethernet/marvell/mv643xx_eth.c
index ea2cedc..81a6cb0 100644
--- a/drivers/net/ethernet/marvell/mv643xx_eth.c
+++ b/drivers/net/ethernet/marvell/mv643xx_eth.c
@@ -55,9 +55,9 @@
#include <linux/mv643xx_eth.h>
#include <linux/io.h>
#include <linux/types.h>
-#include <linux/inet_lro.h>
#include <linux/slab.h>
#include <linux/clk.h>
+#include <linux/interrupt.h>
static char mv643xx_eth_driver_name[] = "mv643xx_eth";
static char mv643xx_eth_driver_version[] = "1.4";
@@ -343,12 +343,6 @@ struct mib_counters {
u32 rx_overrun;
};
-struct lro_counters {
- u32 lro_aggregated;
- u32 lro_flushed;
- u32 lro_no_desc;
-};
-
struct rx_queue {
int index;
@@ -363,8 +357,6 @@ struct rx_queue {
int rx_desc_area_size;
void **rx_data;
unsigned int frag_size;
- struct net_lro_mgr lro_mgr;
- struct net_lro_desc lro_arr[8];
};
struct tx_queue {
@@ -400,8 +392,6 @@ struct mv643xx_eth_private {
spinlock_t mib_counters_lock;
struct mib_counters mib_counters;
- struct lro_counters lro_counters;
-
struct work_struct tx_timeout_task;
struct napi_struct napi;
@@ -533,33 +523,6 @@ static void txq_maybe_wake(struct tx_queue *txq)
}
-/* rx napi ******************************************************************/
-static int
-mv643xx_get_skb_header(struct sk_buff *skb, void **iphdr, void **tcph,
- u64 *hdr_flags, void *priv)
-{
- unsigned long cmd_sts = (unsigned long)priv;
-
- /*
- * Make sure that this packet is Ethernet II, is not VLAN
- * tagged, is IPv4, has a valid IP header, and is TCP.
- */
- if ((cmd_sts & (RX_IP_HDR_OK | RX_PKT_IS_IPV4 |
- RX_PKT_IS_ETHERNETV2 | RX_PKT_LAYER4_TYPE_MASK |
- RX_PKT_IS_VLAN_TAGGED)) !=
- (RX_IP_HDR_OK | RX_PKT_IS_IPV4 |
- RX_PKT_IS_ETHERNETV2 | RX_PKT_LAYER4_TYPE_TCP_IPV4))
- return -1;
-
- skb_reset_network_header(skb);
- skb_set_transport_header(skb, ip_hdrlen(skb));
- *iphdr = ip_hdr(skb);
- *tcph = tcp_hdr(skb);
- *hdr_flags = LRO_IPV4 | LRO_TCP;
-
- return 0;
-}
-
static void mv_frag_free(const struct rx_queue *rxq, void *data)
{
if (rxq->frag_size)
@@ -568,14 +531,12 @@ static void mv_frag_free(const struct rx_queue *rxq, void *data)
kfree(data);
}
-static int rxq_process(struct rx_queue *rxq, int budget)
+static int rxq_process(struct napi_struct *napi, struct rx_queue *rxq, int budget)
{
struct mv643xx_eth_private *mp = rxq_to_mp(rxq);
struct net_device_stats *stats = &mp->dev->stats;
- int lro_flush_needed;
int rx;
- lro_flush_needed = 0;
rx = 0;
while (rx < budget && rxq->rx_desc_count) {
struct rx_desc *rx_desc;
@@ -646,12 +607,7 @@ static int rxq_process(struct rx_queue *rxq, int budget)
skb->ip_summed = CHECKSUM_UNNECESSARY;
skb->protocol = eth_type_trans(skb, mp->dev);
- if (skb->dev->features & NETIF_F_LRO &&
- skb->ip_summed == CHECKSUM_UNNECESSARY) {
- lro_receive_skb(&rxq->lro_mgr, skb, (void *)cmd_sts);
- lro_flush_needed = 1;
- } else
- netif_receive_skb(skb);
+ napi_gro_receive(napi, skb);
continue;
@@ -671,9 +627,6 @@ err:
dev_kfree_skb(skb);
}
- if (lro_flush_needed)
- lro_flush_all(&rxq->lro_mgr);
-
if (rx < budget)
mp->work_rx &= ~(1 << rxq->index);
@@ -723,7 +676,6 @@ static int rxq_refill(struct rx_queue *rxq, int budget)
rxq->rx_data[rx] = data;
wmb();
rx_desc->cmd_sts = BUFFER_OWNED_BY_DMA | RX_ENABLE_INTERRUPT;
-
}
if (refilled < budget)
@@ -1214,26 +1166,6 @@ static struct net_device_stats *mv643xx_eth_get_stats(struct net_device *dev)
return stats;
}
-static void mv643xx_eth_grab_lro_stats(struct mv643xx_eth_private *mp)
-{
- u32 lro_aggregated = 0;
- u32 lro_flushed = 0;
- u32 lro_no_desc = 0;
- int i;
-
- for (i = 0; i < mp->rxq_count; i++) {
- struct rx_queue *rxq = mp->rxq + i;
-
- lro_aggregated += rxq->lro_mgr.stats.aggregated;
- lro_flushed += rxq->lro_mgr.stats.flushed;
- lro_no_desc += rxq->lro_mgr.stats.no_desc;
- }
-
- mp->lro_counters.lro_aggregated = lro_aggregated;
- mp->lro_counters.lro_flushed = lro_flushed;
- mp->lro_counters.lro_no_desc = lro_no_desc;
-}
-
static inline u32 mib_read(struct mv643xx_eth_private *mp, int offset)
{
return rdl(mp, MIB_COUNTERS(mp->port_num) + offset);
@@ -1397,10 +1329,6 @@ struct mv643xx_eth_stats {
{ #m, FIELD_SIZEOF(struct mib_counters, m), \
-1, offsetof(struct mv643xx_eth_private, mib_counters.m) }
-#define LROSTAT(m) \
- { #m, FIELD_SIZEOF(struct lro_counters, m), \
- -1, offsetof(struct mv643xx_eth_private, lro_counters.m) }
-
static const struct mv643xx_eth_stats mv643xx_eth_stats[] = {
SSTAT(rx_packets),
SSTAT(tx_packets),
@@ -1442,9 +1370,6 @@ static const struct mv643xx_eth_stats mv643xx_eth_stats[] = {
MIBSTAT(late_collision),
MIBSTAT(rx_discard),
MIBSTAT(rx_overrun),
- LROSTAT(lro_aggregated),
- LROSTAT(lro_flushed),
- LROSTAT(lro_no_desc),
};
static int
@@ -1642,7 +1567,6 @@ static void mv643xx_eth_get_ethtool_stats(struct net_device *dev,
mv643xx_eth_get_stats(dev);
mib_counters_update(mp);
- mv643xx_eth_grab_lro_stats(mp);
for (i = 0; i < ARRAY_SIZE(mv643xx_eth_stats); i++) {
const struct mv643xx_eth_stats *stat;
@@ -1915,19 +1839,6 @@ static int rxq_init(struct mv643xx_eth_private *mp, int index)
nexti * sizeof(struct rx_desc);
}
- rxq->lro_mgr.dev = mp->dev;
- memset(&rxq->lro_mgr.stats, 0, sizeof(rxq->lro_mgr.stats));
- rxq->lro_mgr.features = LRO_F_NAPI;
- rxq->lro_mgr.ip_summed = CHECKSUM_UNNECESSARY;
- rxq->lro_mgr.ip_summed_aggr = CHECKSUM_UNNECESSARY;
- rxq->lro_mgr.max_desc = ARRAY_SIZE(rxq->lro_arr);
- rxq->lro_mgr.max_aggr = 32;
- rxq->lro_mgr.frag_align_pad = 0;
- rxq->lro_mgr.lro_arr = rxq->lro_arr;
- rxq->lro_mgr.get_skb_header = mv643xx_get_skb_header;
-
- memset(&rxq->lro_arr, 0, sizeof(rxq->lro_arr));
-
return 0;
@@ -2192,7 +2103,7 @@ static int mv643xx_eth_poll(struct napi_struct *napi, int budget)
work_done += txq_reclaim(mp->txq + queue, work_tbd, 0);
txq_maybe_wake(mp->txq + queue);
} else if (mp->work_rx & queue_mask) {
- work_done += rxq_process(mp->rxq + queue, work_tbd);
+ work_done += rxq_process(napi, mp->rxq + queue, work_tbd);
} else if (!mp->oom && (mp->work_rx_refill & queue_mask)) {
work_done += rxq_refill(mp->rxq + queue, work_tbd);
} else {
Regards,
Willy
On 04/11/2013 07:55 PM, Ben Hutchings wrote:
> On Thu, 2013-04-11 at 14:40 +0200, Sebastian Hesselbarth wrote:
>> This patch adds GRO support to mv643xx_eth by making it invoke
>> napi_gro_receive instead of netif_receive_skb.
>
> The inet_lro support should be removed at the same time; inet_lro is now
> deprecated and there should be no need to keep both of them.
Ben,
I agree on removing it asap, but I would also like to see GRO support
go in. Maybe we can postpone LRO removal just a little bit. I have just
started to look at mv643xx_eth and remember that Marvell Orion SoCs are
not the only platform using it. I haven't heard a single word from the
ppc guys yet.

Willy just posted an experimental patch to remove LRO. I'll have
a look at it and test it out on Dove.
Sebastian
From: Sebastian Hesselbarth <[email protected]>
Date: Thu, 11 Apr 2013 14:40:23 +0200
> This patch adds GRO support to mv643xx_eth by making it invoke
> napi_gro_receive instead of netif_receive_skb.
>
> Signed-off-by: Soeren Moch <[email protected]>
> Signed-off-by: Sebastian Hesselbarth <[email protected]>
Applied.