2008-07-17 12:16:36

by David Miller

[permalink] [raw]
Subject: [PATCH 11/31]: net: Implement simple sw TX hashing.


It just xor hashes over IPv4/IPv6 addresses and ports of transport.

The only assumption it makes is that skb_network_header() is set
correctly.

With bug fixes from Eric Dumazet.

Signed-off-by: David S. Miller <[email protected]>
---
net/core/dev.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 52 insertions(+), 0 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 7ca9564..467bfb3 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -121,6 +121,9 @@
#include <linux/ctype.h>
#include <linux/if_arp.h>
#include <linux/if_vlan.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include <linux/in.h>

#include "net-sysfs.h"

@@ -1665,6 +1668,53 @@ out_kfree_skb:
* --BLG
*/

+static u16 simple_tx_hash(struct net_device *dev, struct sk_buff *skb)
+{
+ u32 *addr, *ports, hash, ihl;
+ u8 ip_proto;
+ int alen;
+
+ switch (skb->protocol) {
+ case __constant_htons(ETH_P_IP):
+ ip_proto = ip_hdr(skb)->protocol;
+ addr = &ip_hdr(skb)->saddr;
+ ihl = ip_hdr(skb)->ihl;
+ alen = 2;
+ break;
+ case __constant_htons(ETH_P_IPV6):
+ ip_proto = ipv6_hdr(skb)->nexthdr;
+ addr = &ipv6_hdr(skb)->saddr.s6_addr32[0];
+ ihl = (40 >> 2);
+ alen = 8;
+ break;
+ default:
+ return 0;
+ }
+
+ ports = (u32 *) (skb_network_header(skb) + (ihl * 4));
+
+ hash = 0;
+ while (alen--)
+ hash ^= *addr++;
+
+ switch (ip_proto) {
+ case IPPROTO_TCP:
+ case IPPROTO_UDP:
+ case IPPROTO_DCCP:
+ case IPPROTO_ESP:
+ case IPPROTO_AH:
+ case IPPROTO_SCTP:
+ case IPPROTO_UDPLITE:
+ hash ^= *ports;
+ break;
+
+ default:
+ break;
+ }
+
+ return hash % dev->real_num_tx_queues;
+}
+
static struct netdev_queue *dev_pick_tx(struct net_device *dev,
struct sk_buff *skb)
{
@@ -1672,6 +1722,8 @@ static struct netdev_queue *dev_pick_tx(struct net_device *dev,

if (dev->select_queue)
queue_index = dev->select_queue(dev, skb);
+ else if (dev->real_num_tx_queues > 1)
+ queue_index = simple_tx_hash(dev, skb);

skb_set_queue_mapping(skb, queue_index);
return netdev_get_tx_queue(dev, queue_index);
--
1.5.6.2.255.gbed62



2008-07-17 16:02:01

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 11/31]: net: Implement simple sw TX hashing.

David Miller <[email protected]> writes:

> +static u16 simple_tx_hash(struct net_device *dev, struct sk_buff *skb)
> +{
> + u32 *addr, *ports, hash, ihl;
> + u8 ip_proto;
> + int alen;
> +
> + switch (skb->protocol) {
> + case __constant_htons(ETH_P_IP):
> + ip_proto = ip_hdr(skb)->protocol;
> + addr = &ip_hdr(skb)->saddr;
> + ihl = ip_hdr(skb)->ihl;
> + alen = 2;
> + break;
> + case __constant_htons(ETH_P_IPV6):
> + ip_proto = ipv6_hdr(skb)->nexthdr;
> + addr = &ipv6_hdr(skb)->saddr.s6_addr32[0];
> + ihl = (40 >> 2);
> + alen = 8;
> + break;
> + default:
> + return 0;

Could you add a linuxmib counter for this default event? Just so that
people can diagnose this more easily.

> +
> + ports = (u32 *) (skb_network_header(skb) + (ihl * 4));
> +
> + hash = 0;
> + while (alen--)
> + hash ^= *addr++;

Are you sure the hash behaves the same between big endian and little
endian? Perhaps it would be safer to always convert endian even if
that complicates the code a bit.

> + hash ^= *ports;
> + break;
> +
> + default:
> + break;

And also a linuxmib counter here.

> + }
> +
> + return hash % dev->real_num_tx_queues;

I haven't rechecked it in detail, but I suspect some more
folding of higher bits would be better again for endian safety.

-Andi


2008-08-03 12:35:51

by Brice Goglin

[permalink] [raw]
Subject: Re: [PATCH 11/31]: net: Implement simple sw TX hashing.

Brice Goglin wrote:
> So I wonder if we shouldn't change all this into:
> 1) simple_tx_hash(skb) returns a big protocol-independent integer (like
> a concatenation or basic xor of IP proto/addr/...). It basically just
> converts the skb header in something flat and proto-independent.
> 2.a) if the device provides select_queue(), we pass this integer to it.
> select_queue() either uses the skb headers if it really wants to hash IP
> or IPv6 precisely, or uses our protocol-independent integer and just
> hash it in a dumb way without caring about the protocol hidden behind it.
> 2.b) if there's no select_queue(), we hash the big integer depending on
> the number of tx queues
>

Or (easier) just export simple_tx_hash() to modules so that drivers can
rely on it to hash protocols that they have no clue about?

Brice


2008-08-03 12:16:32

by Brice Goglin

[permalink] [raw]
Subject: Re: [PATCH 11/31]: net: Implement simple sw TX hashing.

David Miller wrote:
> It just xor hashes over IPv4/IPv6 addresses and ports of transport.
>
> The only assumption it makes is that skb_network_header() is set
> correctly.
>

Hey David,

Sorry for the late reply, I didn't have time to look at multiqueue tx
details before. I have some curiosity-questions about your hashing.

> +static u16 simple_tx_hash(struct net_device *dev, struct sk_buff *skb)
> +{
> + u32 *addr, *ports, hash, ihl;
> + u8 ip_proto;
> + int alen;
> +
> + switch (skb->protocol) {
> + case __constant_htons(ETH_P_IP):
> + ip_proto = ip_hdr(skb)->protocol;
> + addr = &ip_hdr(skb)->saddr;
> + ihl = ip_hdr(skb)->ihl;
> + alen = 2;
> + break;
> + case __constant_htons(ETH_P_IPV6):
> + ip_proto = ipv6_hdr(skb)->nexthdr;
> + addr = &ipv6_hdr(skb)->saddr.s6_addr32[0];
> + ihl = (40 >> 2);
> + alen = 8;
> + break;
> + default:
> + return 0;
> + }
>

What's your plan for other protocols here? Should we add an optional
tx_hash() method in struct packet_type?

> @@ -1672,6 +1722,8 @@ static struct netdev_queue *dev_pick_tx(struct net_device *dev,
>
> if (dev->select_queue)
> queue_index = dev->select_queue(dev, skb);
> + else if (dev->real_num_tx_queues > 1)
> + queue_index = simple_tx_hash(dev, skb);
>

Some devices might will add some protocol specific hashing (IP and IPv6)
that could look like what simple_tx_hash() does above, right? But if we
add some new protocols in sample_tx_hash(), it will be ignored by these
devices, and we might have to update the hashing in all select_queue()
methods as well.

So I wonder if we shouldn't change all this into:
1) simple_tx_hash(skb) returns a big protocol-independent integer (like
a concatenation or basic xor of IP proto/addr/...). It basically just
converts the skb header in something flat and proto-independent.
2.a) if the device provides select_queue(), we pass this integer to it.
select_queue() either uses the skb headers if it really wants to hash IP
or IPv6 precisely, or uses our protocol-independent integer and just
hash it in a dumb way without caring about the protocol hidden behind it.
2.b) if there's no select_queue(), we hash the big integer depending on
the number of tx queues

This way, we can easily add some protocol specific hashing to all
drivers without having to modify select_queue() everywhere. Of course,
if we don't care about hashing anything but IP/IPv6, it doesn't matter :)

Brice