2023-10-16 07:16:10

by Beniamino Galvani

[permalink] [raw]
Subject: [PATCH net-next v2 0/7] net: consolidate IPv4 route lookup for UDP tunnels

At the moment different UDP tunnels rely on different functions for
IPv4 route lookup, and those functions all implement the same
logic. Only bareudp uses the generic ip_route_output_tunnel(), while
geneve and vxlan basically duplicate it slightly differently.

This series first extends the generic lookup function so that it is
suitable for all UDP tunnel implementations. Then, bareudp, geneve and
vxlan are adapted to use them.

This results in code with less duplication and hopefully better
maintainability.

After this series is merged, IPv6 will be converted in a similar way.

Changelog:
v2
- fix compilation with IPv6 disabled

Beniamino Galvani (7):
ipv4: rename and move ip_route_output_tunnel()
ipv4: remove "proto" argument from udp_tunnel_dst_lookup()
ipv4: add new arguments to udp_tunnel_dst_lookup()
ipv4: use tunnel flow flags for tunnel route lookups
geneve: add dsfield helper function
geneve: use generic function for tunnel IPv4 route lookup
vxlan: use generic function for tunnel IPv4 route lookup

drivers/net/bareudp.c | 11 ++--
drivers/net/geneve.c | 111 ++++++++++++--------------------
drivers/net/vxlan/vxlan_core.c | 114 ++++++++++++---------------------
include/net/route.h | 6 --
include/net/udp_tunnel.h | 8 +++
net/ipv4/route.c | 48 --------------
net/ipv4/udp_tunnel_core.c | 49 ++++++++++++++
7 files changed, 147 insertions(+), 200 deletions(-)

--
2.40.1


2023-10-16 07:16:15

by Beniamino Galvani

[permalink] [raw]
Subject: [PATCH net-next v2 3/7] ipv4: add new arguments to udp_tunnel_dst_lookup()

We want to make the function more generic so that it can be used by
other UDP tunnel implementations such as geneve and vxlan. To do that,
add the following arguments:

- source and destination UDP port;
- ifindex of the output interface, needed by vxlan;
- the tos, because in some cases it is not taken from struct
ip_tunnel_info (for example, when it's inherited from the inner
packet);
- the dst cache, because not all tunnel types (e.g. vxlan) want to
use the one from struct ip_tunnel_info.

With these parameters, the function no longer needs the full struct
ip_tunnel_info as argument and we can pass only the relevant part of
it (struct ip_tunnel_key).

Suggested-by: Guillaume Nault <[email protected]>
Signed-off-by: Beniamino Galvani <[email protected]>
Reviewed-by: David Ahern <[email protected]>
---
drivers/net/bareudp.c | 11 +++++++----
include/net/udp_tunnel.h | 8 +++++---
net/ipv4/udp_tunnel_core.c | 26 +++++++++++++-------------
3 files changed, 25 insertions(+), 20 deletions(-)

diff --git a/drivers/net/bareudp.c b/drivers/net/bareudp.c
index 6af67cac6bde..47a9c2a5583c 100644
--- a/drivers/net/bareudp.c
+++ b/drivers/net/bareudp.c
@@ -306,8 +306,10 @@ static int bareudp_xmit_skb(struct sk_buff *skb, struct net_device *dev,
if (!sock)
return -ESHUTDOWN;

- rt = udp_tunnel_dst_lookup(skb, dev, bareudp->net, &saddr, info,
- use_cache);
+ rt = udp_tunnel_dst_lookup(skb, dev, bareudp->net, 0, &saddr, &info->key,
+ 0, 0, key->tos,
+ use_cache ?
+ (struct dst_cache *)&info->dst_cache : NULL);

if (IS_ERR(rt))
return PTR_ERR(rt);
@@ -483,8 +485,9 @@ static int bareudp_fill_metadata_dst(struct net_device *dev,
struct rtable *rt;
__be32 saddr;

- rt = udp_tunnel_dst_lookup(skb, dev, bareudp->net, &saddr,
- info, use_cache);
+ rt = udp_tunnel_dst_lookup(skb, dev, bareudp->net, 0, &saddr,
+ &info->key, 0, 0, info->key.tos,
+ use_cache ? &info->dst_cache : NULL);
if (IS_ERR(rt))
return PTR_ERR(rt);

diff --git a/include/net/udp_tunnel.h b/include/net/udp_tunnel.h
index 8f110dbd3784..4d0578fab01a 100644
--- a/include/net/udp_tunnel.h
+++ b/include/net/udp_tunnel.h
@@ -164,9 +164,11 @@ void udp_tunnel_sock_release(struct socket *sock);

struct rtable *udp_tunnel_dst_lookup(struct sk_buff *skb,
struct net_device *dev,
- struct net *net, __be32 *saddr,
- const struct ip_tunnel_info *info,
- bool use_cache);
+ struct net *net, int oif,
+ __be32 *saddr,
+ const struct ip_tunnel_key *key,
+ __be16 sport, __be16 dport, u8 tos,
+ struct dst_cache *dst_cache);

struct metadata_dst *udp_tun_rx_dst(struct sk_buff *skb, unsigned short family,
__be16 flags, __be64 tunnel_id,
diff --git a/net/ipv4/udp_tunnel_core.c b/net/ipv4/udp_tunnel_core.c
index 9b0cfd72d5fd..494685e82856 100644
--- a/net/ipv4/udp_tunnel_core.c
+++ b/net/ipv4/udp_tunnel_core.c
@@ -206,31 +206,31 @@ EXPORT_SYMBOL_GPL(udp_tun_rx_dst);

struct rtable *udp_tunnel_dst_lookup(struct sk_buff *skb,
struct net_device *dev,
- struct net *net, __be32 *saddr,
- const struct ip_tunnel_info *info,
- bool use_cache)
+ struct net *net, int oif,
+ __be32 *saddr,
+ const struct ip_tunnel_key *key,
+ __be16 sport, __be16 dport, u8 tos,
+ struct dst_cache *dst_cache)
{
-#ifdef CONFIG_DST_CACHE
- struct dst_cache *dst_cache;
-#endif
struct rtable *rt = NULL;
struct flowi4 fl4;
- __u8 tos;

#ifdef CONFIG_DST_CACHE
- dst_cache = (struct dst_cache *)&info->dst_cache;
- if (use_cache) {
+ if (dst_cache) {
rt = dst_cache_get_ip4(dst_cache, saddr);
if (rt)
return rt;
}
#endif
+
memset(&fl4, 0, sizeof(fl4));
fl4.flowi4_mark = skb->mark;
fl4.flowi4_proto = IPPROTO_UDP;
- fl4.daddr = info->key.u.ipv4.dst;
- fl4.saddr = info->key.u.ipv4.src;
- tos = info->key.tos;
+ fl4.flowi4_oif = oif;
+ fl4.daddr = key->u.ipv4.dst;
+ fl4.saddr = key->u.ipv4.src;
+ fl4.fl4_dport = dport;
+ fl4.fl4_sport = sport;
fl4.flowi4_tos = RT_TOS(tos);

rt = ip_route_output_key(net, &fl4);
@@ -244,7 +244,7 @@ struct rtable *udp_tunnel_dst_lookup(struct sk_buff *skb,
return ERR_PTR(-ELOOP);
}
#ifdef CONFIG_DST_CACHE
- if (use_cache)
+ if (dst_cache)
dst_cache_set_ip4(dst_cache, &rt->dst, fl4.saddr);
#endif
*saddr = fl4.saddr;
--
2.40.1

2023-10-16 07:16:22

by Beniamino Galvani

[permalink] [raw]
Subject: [PATCH net-next v2 1/7] ipv4: rename and move ip_route_output_tunnel()

At the moment ip_route_output_tunnel() is used only by bareudp.
Ideally, other UDP tunnel implementations should use it, but to do so
the function needs to accept new parameters that are specific for UDP
tunnels, such as the ports.

Prepare for these changes by renaming the function to
udp_tunnel_dst_lookup() and move it to file
net/ipv4/udp_tunnel_core.c.

Suggested-by: Guillaume Nault <[email protected]>
Signed-off-by: Beniamino Galvani <[email protected]>
Reviewed-by: David Ahern <[email protected]>
---
drivers/net/bareudp.c | 8 +++----
include/net/route.h | 6 -----
include/net/udp_tunnel.h | 6 +++++
net/ipv4/route.c | 48 --------------------------------------
net/ipv4/udp_tunnel_core.c | 48 ++++++++++++++++++++++++++++++++++++++
5 files changed, 58 insertions(+), 58 deletions(-)

diff --git a/drivers/net/bareudp.c b/drivers/net/bareudp.c
index 683203f87ae2..63fc32fa1af5 100644
--- a/drivers/net/bareudp.c
+++ b/drivers/net/bareudp.c
@@ -306,8 +306,8 @@ static int bareudp_xmit_skb(struct sk_buff *skb, struct net_device *dev,
if (!sock)
return -ESHUTDOWN;

- rt = ip_route_output_tunnel(skb, dev, bareudp->net, &saddr, info,
- IPPROTO_UDP, use_cache);
+ rt = udp_tunnel_dst_lookup(skb, dev, bareudp->net, &saddr, info,
+ IPPROTO_UDP, use_cache);

if (IS_ERR(rt))
return PTR_ERR(rt);
@@ -483,8 +483,8 @@ static int bareudp_fill_metadata_dst(struct net_device *dev,
struct rtable *rt;
__be32 saddr;

- rt = ip_route_output_tunnel(skb, dev, bareudp->net, &saddr,
- info, IPPROTO_UDP, use_cache);
+ rt = udp_tunnel_dst_lookup(skb, dev, bareudp->net, &saddr,
+ info, IPPROTO_UDP, use_cache);
if (IS_ERR(rt))
return PTR_ERR(rt);

diff --git a/include/net/route.h b/include/net/route.h
index 5c248a8e3d0e..980ab474eabd 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -136,12 +136,6 @@ static inline struct rtable *__ip_route_output_key(struct net *net,

struct rtable *ip_route_output_flow(struct net *, struct flowi4 *flp,
const struct sock *sk);
-struct rtable *ip_route_output_tunnel(struct sk_buff *skb,
- struct net_device *dev,
- struct net *net, __be32 *saddr,
- const struct ip_tunnel_info *info,
- u8 protocol, bool use_cache);
-
struct dst_entry *ipv4_blackhole_route(struct net *net,
struct dst_entry *dst_orig);

diff --git a/include/net/udp_tunnel.h b/include/net/udp_tunnel.h
index 21ba0a25f936..11e810ca5088 100644
--- a/include/net/udp_tunnel.h
+++ b/include/net/udp_tunnel.h
@@ -162,6 +162,12 @@ int udp_tunnel6_xmit_skb(struct dst_entry *dst, struct sock *sk,

void udp_tunnel_sock_release(struct socket *sock);

+struct rtable *udp_tunnel_dst_lookup(struct sk_buff *skb,
+ struct net_device *dev,
+ struct net *net, __be32 *saddr,
+ const struct ip_tunnel_info *info,
+ u8 protocol, bool use_cache);
+
struct metadata_dst *udp_tun_rx_dst(struct sk_buff *skb, unsigned short family,
__be16 flags, __be64 tunnel_id,
int md_size);
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index e2bf4602b559..3290a4442b4a 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2885,54 +2885,6 @@ struct rtable *ip_route_output_flow(struct net *net, struct flowi4 *flp4,
}
EXPORT_SYMBOL_GPL(ip_route_output_flow);

-struct rtable *ip_route_output_tunnel(struct sk_buff *skb,
- struct net_device *dev,
- struct net *net, __be32 *saddr,
- const struct ip_tunnel_info *info,
- u8 protocol, bool use_cache)
-{
-#ifdef CONFIG_DST_CACHE
- struct dst_cache *dst_cache;
-#endif
- struct rtable *rt = NULL;
- struct flowi4 fl4;
- __u8 tos;
-
-#ifdef CONFIG_DST_CACHE
- dst_cache = (struct dst_cache *)&info->dst_cache;
- if (use_cache) {
- rt = dst_cache_get_ip4(dst_cache, saddr);
- if (rt)
- return rt;
- }
-#endif
- memset(&fl4, 0, sizeof(fl4));
- fl4.flowi4_mark = skb->mark;
- fl4.flowi4_proto = protocol;
- fl4.daddr = info->key.u.ipv4.dst;
- fl4.saddr = info->key.u.ipv4.src;
- tos = info->key.tos;
- fl4.flowi4_tos = RT_TOS(tos);
-
- rt = ip_route_output_key(net, &fl4);
- if (IS_ERR(rt)) {
- netdev_dbg(dev, "no route to %pI4\n", &fl4.daddr);
- return ERR_PTR(-ENETUNREACH);
- }
- if (rt->dst.dev == dev) { /* is this necessary? */
- netdev_dbg(dev, "circular route to %pI4\n", &fl4.daddr);
- ip_rt_put(rt);
- return ERR_PTR(-ELOOP);
- }
-#ifdef CONFIG_DST_CACHE
- if (use_cache)
- dst_cache_set_ip4(dst_cache, &rt->dst, fl4.saddr);
-#endif
- *saddr = fl4.saddr;
- return rt;
-}
-EXPORT_SYMBOL_GPL(ip_route_output_tunnel);
-
/* called with rcu_read_lock held */
static int rt_fill_info(struct net *net, __be32 dst, __be32 src,
struct rtable *rt, u32 table_id, struct flowi4 *fl4,
diff --git a/net/ipv4/udp_tunnel_core.c b/net/ipv4/udp_tunnel_core.c
index 1e7e4aecdc48..96f93f92b6ce 100644
--- a/net/ipv4/udp_tunnel_core.c
+++ b/net/ipv4/udp_tunnel_core.c
@@ -204,4 +204,52 @@ struct metadata_dst *udp_tun_rx_dst(struct sk_buff *skb, unsigned short family,
}
EXPORT_SYMBOL_GPL(udp_tun_rx_dst);

+struct rtable *udp_tunnel_dst_lookup(struct sk_buff *skb,
+ struct net_device *dev,
+ struct net *net, __be32 *saddr,
+ const struct ip_tunnel_info *info,
+ u8 protocol, bool use_cache)
+{
+#ifdef CONFIG_DST_CACHE
+ struct dst_cache *dst_cache;
+#endif
+ struct rtable *rt = NULL;
+ struct flowi4 fl4;
+ __u8 tos;
+
+#ifdef CONFIG_DST_CACHE
+ dst_cache = (struct dst_cache *)&info->dst_cache;
+ if (use_cache) {
+ rt = dst_cache_get_ip4(dst_cache, saddr);
+ if (rt)
+ return rt;
+ }
+#endif
+ memset(&fl4, 0, sizeof(fl4));
+ fl4.flowi4_mark = skb->mark;
+ fl4.flowi4_proto = protocol;
+ fl4.daddr = info->key.u.ipv4.dst;
+ fl4.saddr = info->key.u.ipv4.src;
+ tos = info->key.tos;
+ fl4.flowi4_tos = RT_TOS(tos);
+
+ rt = ip_route_output_key(net, &fl4);
+ if (IS_ERR(rt)) {
+ netdev_dbg(dev, "no route to %pI4\n", &fl4.daddr);
+ return ERR_PTR(-ENETUNREACH);
+ }
+ if (rt->dst.dev == dev) { /* is this necessary? */
+ netdev_dbg(dev, "circular route to %pI4\n", &fl4.daddr);
+ ip_rt_put(rt);
+ return ERR_PTR(-ELOOP);
+ }
+#ifdef CONFIG_DST_CACHE
+ if (use_cache)
+ dst_cache_set_ip4(dst_cache, &rt->dst, fl4.saddr);
+#endif
+ *saddr = fl4.saddr;
+ return rt;
+}
+EXPORT_SYMBOL_GPL(udp_tunnel_dst_lookup);
+
MODULE_LICENSE("GPL");
--
2.40.1

2023-10-16 07:16:24

by Beniamino Galvani

[permalink] [raw]
Subject: [PATCH net-next v2 6/7] geneve: use generic function for tunnel IPv4 route lookup

The route lookup can be done now via generic function
udp_tunnel_dst_lookup() to replace the custom implementation in
geneve_get_v4_rt().

Suggested-by: Guillaume Nault <[email protected]>
Signed-off-by: Beniamino Galvani <[email protected]>
---
drivers/net/geneve.c | 98 +++++++++++++++-----------------------------
1 file changed, 32 insertions(+), 66 deletions(-)

diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 572c3e36b209..23041eeec121 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -800,58 +800,6 @@ static u8 geneve_get_dsfield(struct sk_buff *skb, struct net_device *dev,
return dsfield;
}

-static struct rtable *geneve_get_v4_rt(struct sk_buff *skb,
- struct net_device *dev,
- struct geneve_sock *gs4,
- struct flowi4 *fl4,
- const struct ip_tunnel_info *info,
- __be16 dport, __be16 sport,
- __u8 *full_tos)
-{
- bool use_cache = ip_tunnel_dst_cache_usable(skb, info);
- struct geneve_dev *geneve = netdev_priv(dev);
- struct dst_cache *dst_cache;
- struct rtable *rt = NULL;
- __u8 tos;
-
- if (!gs4)
- return ERR_PTR(-EIO);
-
- memset(fl4, 0, sizeof(*fl4));
- fl4->flowi4_mark = skb->mark;
- fl4->flowi4_proto = IPPROTO_UDP;
- fl4->daddr = info->key.u.ipv4.dst;
- fl4->saddr = info->key.u.ipv4.src;
- fl4->fl4_dport = dport;
- fl4->fl4_sport = sport;
- fl4->flowi4_flags = info->key.flow_flags;
-
- tos = geneve_get_dsfield(skb, dev, info, &use_cache);
- fl4->flowi4_tos = RT_TOS(tos);
- if (full_tos)
- *full_tos = tos;
-
- dst_cache = (struct dst_cache *)&info->dst_cache;
- if (use_cache) {
- rt = dst_cache_get_ip4(dst_cache, &fl4->saddr);
- if (rt)
- return rt;
- }
- rt = ip_route_output_key(geneve->net, fl4);
- if (IS_ERR(rt)) {
- netdev_dbg(dev, "no route to %pI4\n", &fl4->daddr);
- return ERR_PTR(-ENETUNREACH);
- }
- if (rt->dst.dev == dev) { /* is this necessary? */
- netdev_dbg(dev, "circular route to %pI4\n", &fl4->daddr);
- ip_rt_put(rt);
- return ERR_PTR(-ELOOP);
- }
- if (use_cache)
- dst_cache_set_ip4(dst_cache, &rt->dst, fl4->saddr);
- return rt;
-}
-
#if IS_ENABLED(CONFIG_IPV6)
static struct dst_entry *geneve_get_v6_dst(struct sk_buff *skb,
struct net_device *dev,
@@ -911,19 +859,28 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
struct geneve_sock *gs4 = rcu_dereference(geneve->sock4);
const struct ip_tunnel_key *key = &info->key;
struct rtable *rt;
- struct flowi4 fl4;
- __u8 full_tos;
+ bool use_cache;
__u8 tos, ttl;
__be16 df = 0;
+ __be32 saddr;
__be16 sport;
int err;

if (!pskb_inet_may_pull(skb))
return -EINVAL;

+ if (!gs4)
+ return -EIO;
+
+ use_cache = ip_tunnel_dst_cache_usable(skb, info);
+ tos = geneve_get_dsfield(skb, dev, info, &use_cache);
sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true);
- rt = geneve_get_v4_rt(skb, dev, gs4, &fl4, info,
- geneve->cfg.info.key.tp_dst, sport, &full_tos);
+
+ rt = udp_tunnel_dst_lookup(skb, dev, geneve->net, 0, &saddr,
+ &info->key,
+ sport, geneve->cfg.info.key.tp_dst, tos,
+ use_cache ?
+ (struct dst_cache *)&info->dst_cache : NULL);
if (IS_ERR(rt))
return PTR_ERR(rt);

@@ -946,8 +903,8 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
return -ENOMEM;
}

- unclone->key.u.ipv4.dst = fl4.saddr;
- unclone->key.u.ipv4.src = fl4.daddr;
+ unclone->key.u.ipv4.dst = saddr;
+ unclone->key.u.ipv4.src = info->key.u.ipv4.dst;
}

if (!pskb_may_pull(skb, ETH_HLEN)) {
@@ -961,13 +918,12 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
return -EMSGSIZE;
}

+ tos = ip_tunnel_ecn_encap(tos, ip_hdr(skb), skb);
if (geneve->cfg.collect_md) {
- tos = ip_tunnel_ecn_encap(key->tos, ip_hdr(skb), skb);
ttl = key->ttl;

df = key->tun_flags & TUNNEL_DONT_FRAGMENT ? htons(IP_DF) : 0;
} else {
- tos = ip_tunnel_ecn_encap(full_tos, ip_hdr(skb), skb);
if (geneve->cfg.ttl_inherit)
ttl = ip_tunnel_get_ttl(ip_hdr(skb), skb);
else
@@ -995,7 +951,7 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
if (unlikely(err))
return err;

- udp_tunnel_xmit_skb(rt, gs4->sock->sk, skb, fl4.saddr, fl4.daddr,
+ udp_tunnel_xmit_skb(rt, gs4->sock->sk, skb, saddr, info->key.u.ipv4.dst,
tos, ttl, df, sport, geneve->cfg.info.key.tp_dst,
!net_eq(geneve->net, dev_net(geneve->dev)),
!(info->key.tun_flags & TUNNEL_CSUM));
@@ -1144,19 +1100,29 @@ static int geneve_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb)

if (ip_tunnel_info_af(info) == AF_INET) {
struct rtable *rt;
- struct flowi4 fl4;
-
struct geneve_sock *gs4 = rcu_dereference(geneve->sock4);
+ bool use_cache;
+ __be32 saddr;
+ u8 tos;
+
+ if (!gs4)
+ return -EIO;
+
+ use_cache = ip_tunnel_dst_cache_usable(skb, info);
+ tos = geneve_get_dsfield(skb, dev, info, &use_cache);
sport = udp_flow_src_port(geneve->net, skb,
1, USHRT_MAX, true);

- rt = geneve_get_v4_rt(skb, dev, gs4, &fl4, info,
- geneve->cfg.info.key.tp_dst, sport, NULL);
+ rt = udp_tunnel_dst_lookup(skb, dev, geneve->net, 0, &saddr,
+ &info->key,
+ sport, geneve->cfg.info.key.tp_dst,
+ tos,
+ use_cache ? &info->dst_cache : NULL);
if (IS_ERR(rt))
return PTR_ERR(rt);

ip_rt_put(rt);
- info->key.u.ipv4.src = fl4.saddr;
+ info->key.u.ipv4.src = saddr;
#if IS_ENABLED(CONFIG_IPV6)
} else if (ip_tunnel_info_af(info) == AF_INET6) {
struct dst_entry *dst;
--
2.40.1

2023-10-16 07:16:31

by Beniamino Galvani

[permalink] [raw]
Subject: [PATCH net-next v2 5/7] geneve: add dsfield helper function

Add a helper function to compute the tos/dsfield. In this way, we can
factor out some duplicate code. Also, the helper will be called from
more places in the next commit.

Suggested-by: Guillaume Nault <[email protected]>
Signed-off-by: Beniamino Galvani <[email protected]>
---
drivers/net/geneve.c | 29 ++++++++++++++++++-----------
1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 78f9d588f712..572c3e36b209 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -784,6 +784,22 @@ static int geneve_build_skb(struct dst_entry *dst, struct sk_buff *skb,
return err;
}

+static u8 geneve_get_dsfield(struct sk_buff *skb, struct net_device *dev,
+ const struct ip_tunnel_info *info,
+ bool *use_cache)
+{
+ struct geneve_dev *geneve = netdev_priv(dev);
+ u8 dsfield;
+
+ dsfield = info->key.tos;
+ if (dsfield == 1 && !geneve->cfg.collect_md) {
+ dsfield = ip_tunnel_get_dsfield(ip_hdr(skb), skb);
+ *use_cache = false;
+ }
+
+ return dsfield;
+}
+
static struct rtable *geneve_get_v4_rt(struct sk_buff *skb,
struct net_device *dev,
struct geneve_sock *gs4,
@@ -810,11 +826,7 @@ static struct rtable *geneve_get_v4_rt(struct sk_buff *skb,
fl4->fl4_sport = sport;
fl4->flowi4_flags = info->key.flow_flags;

- tos = info->key.tos;
- if ((tos == 1) && !geneve->cfg.collect_md) {
- tos = ip_tunnel_get_dsfield(ip_hdr(skb), skb);
- use_cache = false;
- }
+ tos = geneve_get_dsfield(skb, dev, info, &use_cache);
fl4->flowi4_tos = RT_TOS(tos);
if (full_tos)
*full_tos = tos;
@@ -865,12 +877,7 @@ static struct dst_entry *geneve_get_v6_dst(struct sk_buff *skb,
fl6->fl6_dport = dport;
fl6->fl6_sport = sport;

- prio = info->key.tos;
- if ((prio == 1) && !geneve->cfg.collect_md) {
- prio = ip_tunnel_get_dsfield(ip_hdr(skb), skb);
- use_cache = false;
- }
-
+ prio = geneve_get_dsfield(skb, dev, info, &use_cache);
fl6->flowlabel = ip6_make_flowinfo(prio, info->key.label);
dst_cache = (struct dst_cache *)&info->dst_cache;
if (use_cache) {
--
2.40.1

2023-10-16 07:16:34

by Beniamino Galvani

[permalink] [raw]
Subject: [PATCH net-next v2 7/7] vxlan: use generic function for tunnel IPv4 route lookup

The route lookup can be done now via generic function
udp_tunnel_dst_lookup() to replace the custom implementations in
vxlan_get_route().

Note that this patch only touches IPv4, while IPv6 still uses
vxlan6_get_route(). After IPv6 route lookup gets converted as well,
vxlan_xmit_one() can be simplified by removing local variables that
will be passed via "struct ip_tunnel_key", such as remote_ip,
local_ip, flow_flags, label.

Suggested-by: Guillaume Nault <[email protected]>
Signed-off-by: Beniamino Galvani <[email protected]>
---
drivers/net/vxlan/vxlan_core.c | 114 ++++++++++++---------------------
1 file changed, 41 insertions(+), 73 deletions(-)

diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
index ece377b1b6bd..6f7d45e3cfa2 100644
--- a/drivers/net/vxlan/vxlan_core.c
+++ b/drivers/net/vxlan/vxlan_core.c
@@ -2215,57 +2215,6 @@ static int vxlan_build_skb(struct sk_buff *skb, struct dst_entry *dst,
return 0;
}

-static struct rtable *vxlan_get_route(struct vxlan_dev *vxlan, struct net_device *dev,
- struct vxlan_sock *sock4,
- struct sk_buff *skb, int oif, u8 tos,
- __be32 daddr, __be32 *saddr, __be16 dport, __be16 sport,
- __u8 flow_flags, struct dst_cache *dst_cache,
- const struct ip_tunnel_info *info)
-{
- bool use_cache = ip_tunnel_dst_cache_usable(skb, info);
- struct rtable *rt = NULL;
- struct flowi4 fl4;
-
- if (!sock4)
- return ERR_PTR(-EIO);
-
- if (tos && !info)
- use_cache = false;
- if (use_cache) {
- rt = dst_cache_get_ip4(dst_cache, saddr);
- if (rt)
- return rt;
- }
-
- memset(&fl4, 0, sizeof(fl4));
- fl4.flowi4_oif = oif;
- fl4.flowi4_tos = RT_TOS(tos);
- fl4.flowi4_mark = skb->mark;
- fl4.flowi4_proto = IPPROTO_UDP;
- fl4.daddr = daddr;
- fl4.saddr = *saddr;
- fl4.fl4_dport = dport;
- fl4.fl4_sport = sport;
- fl4.flowi4_flags = flow_flags;
-
- rt = ip_route_output_key(vxlan->net, &fl4);
- if (!IS_ERR(rt)) {
- if (rt->dst.dev == dev) {
- netdev_dbg(dev, "circular route to %pI4\n", &daddr);
- ip_rt_put(rt);
- return ERR_PTR(-ELOOP);
- }
-
- *saddr = fl4.saddr;
- if (use_cache)
- dst_cache_set_ip4(dst_cache, &rt->dst, fl4.saddr);
- } else {
- netdev_dbg(dev, "no route to %pI4\n", &daddr);
- return ERR_PTR(-ENETUNREACH);
- }
- return rt;
-}
-
#if IS_ENABLED(CONFIG_IPV6)
static struct dst_entry *vxlan6_get_route(struct vxlan_dev *vxlan,
struct net_device *dev,
@@ -2418,30 +2367,38 @@ void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
{
struct dst_cache *dst_cache;
struct ip_tunnel_info *info;
+ struct ip_tunnel_key *pkey;
+ struct ip_tunnel_key key;
struct vxlan_dev *vxlan = netdev_priv(dev);
const struct iphdr *old_iph = ip_hdr(skb);
union vxlan_addr *dst;
- union vxlan_addr remote_ip, local_ip;
+ union vxlan_addr remote_ip;
struct vxlan_metadata _md;
struct vxlan_metadata *md = &_md;
unsigned int pkt_len = skb->len;
__be16 src_port = 0, dst_port;
struct dst_entry *ndst = NULL;
- __u8 tos, ttl, flow_flags = 0;
+ __u8 tos, ttl;
int ifindex;
int err;
u32 flags = vxlan->cfg.flags;
+ bool use_cache;
bool udp_sum = false;
bool xnet = !net_eq(vxlan->net, dev_net(vxlan->dev));
__be32 vni = 0;
#if IS_ENABLED(CONFIG_IPV6)
+ union vxlan_addr local_ip;
__be32 label;
#endif

info = skb_tunnel_info(skb);
+ use_cache = ip_tunnel_dst_cache_usable(skb, info);

if (rdst) {
dst = &rdst->remote_ip;
+ memset(&key, 0, sizeof(key));
+ pkey = &key;
+
if (vxlan_addr_any(dst)) {
if (did_rsc) {
/* short-circuited back to local bridge */
@@ -2455,7 +2412,15 @@ void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
dst_port = rdst->remote_port ? rdst->remote_port : vxlan->cfg.dst_port;
vni = (rdst->remote_vni) ? : default_vni;
ifindex = rdst->remote_ifindex;
- local_ip = vxlan->cfg.saddr;
+
+ if (dst->sa.sa_family == AF_INET) {
+ key.u.ipv4.src = vxlan->cfg.saddr.sin.sin_addr.s_addr;
+ key.u.ipv4.dst = rdst->remote_ip.sin.sin_addr.s_addr;
+ } else {
+ key.u.ipv6.src = vxlan->cfg.saddr.sin6.sin6_addr;
+ key.u.ipv6.dst = rdst->remote_ip.sin6.sin6_addr;
+ }
+
dst_cache = &rdst->dst_cache;
md->gbp = skb->mark;
if (flags & VXLAN_F_TTL_INHERIT) {
@@ -2469,12 +2434,15 @@ void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
tos = vxlan->cfg.tos;
if (tos == 1)
tos = ip_tunnel_get_dsfield(old_iph, skb);
+ if (tos && !info)
+ use_cache = false;

if (dst->sa.sa_family == AF_INET)
udp_sum = !(flags & VXLAN_F_UDP_ZERO_CSUM_TX);
else
udp_sum = !(flags & VXLAN_F_UDP_ZERO_CSUM6_TX);
#if IS_ENABLED(CONFIG_IPV6)
+ local_ip = vxlan->cfg.saddr;
label = vxlan->cfg.label;
#endif
} else {
@@ -2486,14 +2454,15 @@ void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
remote_ip.sa.sa_family = ip_tunnel_info_af(info);
if (remote_ip.sa.sa_family == AF_INET) {
remote_ip.sin.sin_addr.s_addr = info->key.u.ipv4.dst;
- local_ip.sin.sin_addr.s_addr = info->key.u.ipv4.src;
} else {
remote_ip.sin6.sin6_addr = info->key.u.ipv6.dst;
+#if IS_ENABLED(CONFIG_IPV6)
local_ip.sin6.sin6_addr = info->key.u.ipv6.src;
+#endif
}
dst = &remote_ip;
+ pkey = &info->key;
dst_port = info->key.tp_dst ? : vxlan->cfg.dst_port;
- flow_flags = info->key.flow_flags;
vni = tunnel_id_to_key32(info->key.tun_id);
ifindex = 0;
dst_cache = &info->dst_cache;
@@ -2517,15 +2486,14 @@ void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
struct vxlan_sock *sock4 = rcu_dereference(vxlan->vn4_sock);
struct rtable *rt;
__be16 df = 0;
+ __be32 saddr;

if (!ifindex)
ifindex = sock4->sock->sk->sk_bound_dev_if;

- rt = vxlan_get_route(vxlan, dev, sock4, skb, ifindex, tos,
- dst->sin.sin_addr.s_addr,
- &local_ip.sin.sin_addr.s_addr,
- dst_port, src_port, flow_flags,
- dst_cache, info);
+ rt = udp_tunnel_dst_lookup(skb, dev, vxlan->net, ifindex,
+ &saddr, pkey, src_port, dst_port,
+ tos, use_cache ? dst_cache : NULL);
if (IS_ERR(rt)) {
err = PTR_ERR(rt);
goto tx_error;
@@ -2561,16 +2529,13 @@ void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
} else if (err) {
if (info) {
struct ip_tunnel_info *unclone;
- struct in_addr src, dst;

unclone = skb_tunnel_info_unclone(skb);
if (unlikely(!unclone))
goto tx_error;

- src = remote_ip.sin.sin_addr;
- dst = local_ip.sin.sin_addr;
- unclone->key.u.ipv4.src = src.s_addr;
- unclone->key.u.ipv4.dst = dst.s_addr;
+ unclone->key.u.ipv4.src = pkey->u.ipv4.dst;
+ unclone->key.u.ipv4.dst = saddr;
}
vxlan_encap_bypass(skb, vxlan, vxlan, vni, false);
dst_release(ndst);
@@ -2584,8 +2549,8 @@ void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
if (err < 0)
goto tx_error;

- udp_tunnel_xmit_skb(rt, sock4->sock->sk, skb, local_ip.sin.sin_addr.s_addr,
- dst->sin.sin_addr.s_addr, tos, ttl, df,
+ udp_tunnel_xmit_skb(rt, sock4->sock->sk, skb, saddr,
+ pkey->u.ipv4.dst, tos, ttl, df,
src_port, dst_port, xnet, !udp_sum);
#if IS_ENABLED(CONFIG_IPV6)
} else {
@@ -3286,11 +3251,14 @@ static int vxlan_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb)
struct vxlan_sock *sock4 = rcu_dereference(vxlan->vn4_sock);
struct rtable *rt;

- rt = vxlan_get_route(vxlan, dev, sock4, skb, 0, info->key.tos,
- info->key.u.ipv4.dst,
- &info->key.u.ipv4.src, dport, sport,
- info->key.flow_flags, &info->dst_cache,
- info);
+ if (!sock4)
+ return -EIO;
+
+ rt = udp_tunnel_dst_lookup(skb, dev, vxlan->net, 0,
+ &info->key.u.ipv4.src,
+ &info->key,
+ sport, dport, info->key.tos,
+ &info->dst_cache);
if (IS_ERR(rt))
return PTR_ERR(rt);
ip_rt_put(rt);
--
2.40.1

2023-10-16 07:16:53

by Beniamino Galvani

[permalink] [raw]
Subject: [PATCH net-next v2 4/7] ipv4: use tunnel flow flags for tunnel route lookups

Commit 451ef36bd229 ("ip_tunnels: Add new flow flags field to
ip_tunnel_key") added a new field to struct ip_tunnel_key to control
route lookups. Currently the flag is used by vxlan and geneve tunnels;
use it also in udp_tunnel_dst_lookup() so that it affects all tunnel
types relying on this function.

Signed-off-by: Beniamino Galvani <[email protected]>
Reviewed-by: David Ahern <[email protected]>
---
net/ipv4/udp_tunnel_core.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/net/ipv4/udp_tunnel_core.c b/net/ipv4/udp_tunnel_core.c
index 494685e82856..a87defb2b167 100644
--- a/net/ipv4/udp_tunnel_core.c
+++ b/net/ipv4/udp_tunnel_core.c
@@ -232,6 +232,7 @@ struct rtable *udp_tunnel_dst_lookup(struct sk_buff *skb,
fl4.fl4_dport = dport;
fl4.fl4_sport = sport;
fl4.flowi4_tos = RT_TOS(tos);
+ fl4.flowi4_flags = key->flow_flags;

rt = ip_route_output_key(net, &fl4);
if (IS_ERR(rt)) {
--
2.40.1

2023-10-16 09:01:58

by patchwork-bot+netdevbpf

[permalink] [raw]
Subject: Re: [PATCH net-next v2 0/7] net: consolidate IPv4 route lookup for UDP tunnels

Hello:

This series was applied to netdev/net-next.git (main)
by David S. Miller <[email protected]>:

On Mon, 16 Oct 2023 09:15:19 +0200 you wrote:
> At the moment different UDP tunnels rely on different functions for
> IPv4 route lookup, and those functions all implement the same
> logic. Only bareudp uses the generic ip_route_output_tunnel(), while
> geneve and vxlan basically duplicate it slightly differently.
>
> This series first extends the generic lookup function so that it is
> suitable for all UDP tunnel implementations. Then, bareudp, geneve and
> vxlan are adapted to use them.
>
> [...]

Here is the summary with links:
- [net-next,v2,1/7] ipv4: rename and move ip_route_output_tunnel()
https://git.kernel.org/netdev/net-next/c/bf3fcbf7e7a0
- [net-next,v2,2/7] ipv4: remove "proto" argument from udp_tunnel_dst_lookup()
https://git.kernel.org/netdev/net-next/c/78f3655adcb5
- [net-next,v2,3/7] ipv4: add new arguments to udp_tunnel_dst_lookup()
https://git.kernel.org/netdev/net-next/c/72fc68c6356b
- [net-next,v2,4/7] ipv4: use tunnel flow flags for tunnel route lookups
https://git.kernel.org/netdev/net-next/c/3ae983a603a4
- [net-next,v2,5/7] geneve: add dsfield helper function
https://git.kernel.org/netdev/net-next/c/60a77d11cd5d
- [net-next,v2,6/7] geneve: use generic function for tunnel IPv4 route lookup
https://git.kernel.org/netdev/net-next/c/daa2ba7ed1d1
- [net-next,v2,7/7] vxlan: use generic function for tunnel IPv4 route lookup
https://git.kernel.org/netdev/net-next/c/6f19b2c136d9

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html