Add support for nexthop routes for Marvell Prestera driver.
Subscribe on NEIGH_UPDATE events.
Add features:
- Support connected route adding
e.g.: "ip address add 1.1.1.1/24 dev sw1p1"
e.g.: "ip route add 6.6.6/24 dev sw1p1"
- Support nexthop route adding
e.g.: "ip route add 5.5.5/24 via 1.1.1.2"
- Support ECMP route adding
e.g.: "ip route add 5.5.5/24 nexthop via 1.1.1.2 nexthop via 1.1.1.3"
- Support "offload" and "trap" flags per each nexthop
- Support "offload" flag for neighbours
Limitations:
- Only "local" and "main" tables supported
- Only generic interfaces supported for router (no bridges or vlans)
Flags meaning:
ip route add 5.5.5/24 nexthop via 2.2.2.2 nexthop via 2.2.2.3
ip route show
...
5.5.5.0/24 rt_offload
nexthop via 2.2.2.2 dev sw1p31 weight 1 trap
nexthop via 2.2.2.3 dev sw1p31 weight 1 trap
...
# When you just add route - lpm entry became occupied
# in HW ("rt_offload" flag), but related to nexthops neighbours
# still not resolved ("trap" flag).
#
# After some time...
ip route show
...
5.5.5.0/24 rt_offload
nexthop via 2.2.2.2 dev sw1p31 weight 1 offload
nexthop via 2.2.2.3 dev sw1p31 weight 1 offload
...
# You will see, that appropriate neighbours was resolved and nexthop
# entries occupied in HW too ("offload" flag)
Co-developed-by: Taras Chornyi <[email protected]>
Signed-off-by: Taras Chornyi <[email protected]>
Co-developed-by: Oleksandr Mazur <[email protected]>
Signed-off-by: Oleksandr Mazur <[email protected]>
Signed-off-by: Yevhen Orlov <[email protected]>
Changes for v2:
* Add more reviewers in CC
* Check if route nexthop or direct with fib_nh_gw_family instead of fib_nh_scope
This is needed after,
747c14307214 ("ip: fix dflt addr selection for connected nexthop"),
because direct route is now with the same scope as nexthop (RT_SCOPE_LINK)
Changes for v3:
* Resolve "unused functions" warnings, after
patch ("net: marvell: prestera: Add heplers to interact ... "), and before
patch ("net: marvell: prestera: Add neighbour cache accounting")
Changes for v4:
* Rebase to the latest master to resolve patch applying issues
Changes for v5:
* Repack structures to prevent holes
* Remove unused variables
* Fix misspeling issues
Changes for v6:
* Rebase on top of master
* Fix smatch warnings
Changes for v7:
* Rebase on top of master
* Refactor: use "fib_lookup" instead of "fib_new_table"+"fib_table_lookup",
according to Paolo Abeni suggestion
* Refactor: use "rhashtable_free_and_destroy" instead of rhashtable
walk, according to Paolo Abeni suggestion
Yevhen Orlov (9):
net: marvell: prestera: Add router nexthops ABI
net: marvell: prestera: Add cleanup of allocated fib_nodes
net: marvell: prestera: Add strict cleanup of fib arbiter
net: marvell: prestera: add delayed wq and flush wq on deinit
net: marvell: prestera: Add length macros for prestera_ip_addr
net: marvell: prestera: Add heplers to interact with fib_notifier_info
net: marvell: prestera: add stub handler neighbour events
net: marvell: prestera: Add neighbour cache accounting
net: marvell: prestera: Propagate nh state from hw to kernel
.../net/ethernet/marvell/prestera/prestera.h | 12 +
.../ethernet/marvell/prestera/prestera_hw.c | 130 ++
.../ethernet/marvell/prestera/prestera_hw.h | 11 +
.../ethernet/marvell/prestera/prestera_main.c | 11 +
.../marvell/prestera/prestera_router.c | 1119 ++++++++++++++++-
.../marvell/prestera/prestera_router_hw.c | 366 +++++-
.../marvell/prestera/prestera_router_hw.h | 76 +-
7 files changed, 1683 insertions(+), 42 deletions(-)
--
2.17.1
We poll nexthops in HW and call for each active nexthop appropriate
neighbour.
Also we provide implicity neighbour resolving.
For example, user have added nexthop route:
# ip route add 5.5.5.5 via 1.1.1.2
But neighbour 1.1.1.2 doesn't exist. In this case we will try to call
neigh_event_send, even if there is no traffic.
This is useful, when you have add route, which will be used after some
time but with a lot of traffic (burst). So, we has prepared, offloaded
route in advance.
Co-developed-by: Taras Chornyi <[email protected]>
Signed-off-by: Taras Chornyi <[email protected]>
Co-developed-by: Oleksandr Mazur <[email protected]>
Signed-off-by: Oleksandr Mazur <[email protected]>
Signed-off-by: Yevhen Orlov <[email protected]>
---
.../net/ethernet/marvell/prestera/prestera.h | 3 +
.../marvell/prestera/prestera_router.c | 111 ++++++++++++++++++
2 files changed, 114 insertions(+)
diff --git a/drivers/net/ethernet/marvell/prestera/prestera.h b/drivers/net/ethernet/marvell/prestera/prestera.h
index 540a36069b79..35554ee805cd 100644
--- a/drivers/net/ethernet/marvell/prestera/prestera.h
+++ b/drivers/net/ethernet/marvell/prestera/prestera.h
@@ -324,6 +324,9 @@ struct prestera_router {
struct notifier_block netevent_nb;
u8 *nhgrp_hw_state_cache; /* Bitmap cached hw state of nhs */
unsigned long nhgrp_hw_cache_kick; /* jiffies */
+ struct {
+ struct delayed_work dw;
+ } neighs_update;
};
struct prestera_rxtx_params {
diff --git a/drivers/net/ethernet/marvell/prestera/prestera_router.c b/drivers/net/ethernet/marvell/prestera/prestera_router.c
index af7d24390d2e..4046be0e86ff 100644
--- a/drivers/net/ethernet/marvell/prestera/prestera_router.c
+++ b/drivers/net/ethernet/marvell/prestera/prestera_router.c
@@ -16,6 +16,9 @@
#include "prestera.h"
#include "prestera_router_hw.h"
+#define PRESTERA_IMPLICITY_RESOLVE_DEAD_NEIGH
+#define PRESTERA_NH_PROBE_INTERVAL 5000 /* ms */
+
struct prestera_kern_neigh_cache_key {
struct prestera_ip_addr addr;
struct net_device *dev;
@@ -32,6 +35,7 @@ struct prestera_kern_neigh_cache {
/* Lock cache if neigh is present in kernel */
bool in_kernel;
};
+
struct prestera_kern_fib_cache_key {
struct prestera_ip_addr addr;
u32 prefix_len;
@@ -1021,6 +1025,78 @@ __prestera_k_arb_util_fib_overlapped(struct prestera_switch *sw,
return rfc;
}
+static void __prestera_k_arb_hw_state_upd(struct prestera_switch *sw,
+ struct prestera_kern_neigh_cache *nc)
+{
+ struct prestera_nh_neigh_key nh_key;
+ struct prestera_nh_neigh *nh_neigh;
+ struct neighbour *n;
+ bool hw_active;
+
+ prestera_util_nc_key2nh_key(&nc->key, &nh_key);
+ nh_neigh = prestera_nh_neigh_find(sw, &nh_key);
+ if (!nh_neigh) {
+ pr_err("Cannot find nh_neigh for cached %pI4n",
+ &nc->key.addr.u.ipv4);
+ return;
+ }
+
+ hw_active = prestera_nh_neigh_util_hw_state(sw, nh_neigh);
+
+#ifdef PRESTERA_IMPLICITY_RESOLVE_DEAD_NEIGH
+ if (!hw_active && nc->in_kernel)
+ goto out;
+#else /* PRESTERA_IMPLICITY_RESOLVE_DEAD_NEIGH */
+ if (!hw_active)
+ goto out;
+#endif /* PRESTERA_IMPLICITY_RESOLVE_DEAD_NEIGH */
+
+ if (nc->key.addr.v == PRESTERA_IPV4) {
+ n = neigh_lookup(&arp_tbl, &nc->key.addr.u.ipv4,
+ nc->key.dev);
+ if (!n)
+ n = neigh_create(&arp_tbl, &nc->key.addr.u.ipv4,
+ nc->key.dev);
+ } else {
+ n = NULL;
+ }
+
+ if (!IS_ERR(n) && n) {
+ neigh_event_send(n, NULL);
+ neigh_release(n);
+ } else {
+ pr_err("Cannot create neighbour %pI4n", &nc->key.addr.u.ipv4);
+ }
+
+out:
+ return;
+}
+
+/* Propagate hw state to kernel */
+static void prestera_k_arb_hw_evt(struct prestera_switch *sw)
+{
+ struct prestera_kern_neigh_cache *n_cache;
+ struct rhashtable_iter iter;
+
+ rhashtable_walk_enter(&sw->router->kern_neigh_cache_ht, &iter);
+ rhashtable_walk_start(&iter);
+ while (1) {
+ n_cache = rhashtable_walk_next(&iter);
+
+ if (!n_cache)
+ break;
+
+ if (IS_ERR(n_cache))
+ continue;
+
+ rhashtable_walk_stop(&iter);
+ __prestera_k_arb_hw_state_upd(sw, n_cache);
+ rhashtable_walk_start(&iter);
+ }
+ rhashtable_walk_stop(&iter);
+ rhashtable_walk_exit(&iter);
+}
+
/* Propagate kernel event to hw */
static void prestera_k_arb_n_evt(struct prestera_switch *sw,
struct neighbour *n)
@@ -1441,6 +1517,34 @@ static int prestera_router_netevent_event(struct notifier_block *nb,
return NOTIFY_DONE;
}
+static void prestera_router_update_neighs_work(struct work_struct *work)
+{
+ struct prestera_router *router;
+
+ router = container_of(work, struct prestera_router,
+ neighs_update.dw.work);
+ rtnl_lock();
+
+ prestera_k_arb_hw_evt(router->sw);
+
+ rtnl_unlock();
+ prestera_queue_delayed_work(&router->neighs_update.dw,
+ msecs_to_jiffies(PRESTERA_NH_PROBE_INTERVAL));
+}
+
+static int prestera_neigh_work_init(struct prestera_switch *sw)
+{
+ INIT_DELAYED_WORK(&sw->router->neighs_update.dw,
+ prestera_router_update_neighs_work);
+ prestera_queue_delayed_work(&sw->router->neighs_update.dw, 0);
+ return 0;
+}
+
+static void prestera_neigh_work_fini(struct prestera_switch *sw)
+{
+ cancel_delayed_work_sync(&sw->router->neighs_update.dw);
+}
+
int prestera_router_init(struct prestera_switch *sw)
{
struct prestera_router *router;
@@ -1474,6 +1578,10 @@ int prestera_router_init(struct prestera_switch *sw)
goto err_nh_state_cache_alloc;
}
+ err = prestera_neigh_work_init(sw);
+ if (err)
+ goto err_neigh_work_init;
+
router->inetaddr_valid_nb.notifier_call = __prestera_inetaddr_valid_cb;
err = register_inetaddr_validator_notifier(&router->inetaddr_valid_nb);
if (err)
@@ -1504,6 +1612,8 @@ int prestera_router_init(struct prestera_switch *sw)
err_register_inetaddr_notifier:
unregister_inetaddr_validator_notifier(&router->inetaddr_valid_nb);
err_register_inetaddr_validator_notifier:
+ prestera_neigh_work_fini(sw);
+err_neigh_work_init:
kfree(router->nhgrp_hw_state_cache);
err_nh_state_cache_alloc:
rhashtable_destroy(&router->kern_neigh_cache_ht);
@@ -1522,6 +1632,7 @@ void prestera_router_fini(struct prestera_switch *sw)
unregister_netevent_notifier(&sw->router->netevent_nb);
unregister_inetaddr_notifier(&sw->router->inetaddr_nb);
unregister_inetaddr_validator_notifier(&sw->router->inetaddr_valid_nb);
+ prestera_neigh_work_fini(sw);
prestera_queue_drain();
prestera_k_arb_abort(sw);
--
2.17.1
This will, ensure, that there is no more, preciously allocated fib_cache
entries left after deinit.
Will be used to free allocated resources of nexthop routes, that points
to "not our" port (e.g. eth0).
Signed-off-by: Yevhen Orlov <[email protected]>
---
.../marvell/prestera/prestera_router.c | 42 ++++++++++++++++++-
1 file changed, 41 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/marvell/prestera/prestera_router.c b/drivers/net/ethernet/marvell/prestera/prestera_router.c
index a8548b9f9cf1..b4fd8276bbce 100644
--- a/drivers/net/ethernet/marvell/prestera/prestera_router.c
+++ b/drivers/net/ethernet/marvell/prestera/prestera_router.c
@@ -72,14 +72,21 @@ prestera_kern_fib_cache_find(struct prestera_switch *sw,
return fib_cache;
}
+static void
+__prestera_kern_fib_cache_destruct(struct prestera_switch *sw,
+ struct prestera_kern_fib_cache *fib_cache)
+{
+ fib_info_put(fib_cache->fi);
+}
+
static void
prestera_kern_fib_cache_destroy(struct prestera_switch *sw,
struct prestera_kern_fib_cache *fib_cache)
{
- fib_info_put(fib_cache->fi);
rhashtable_remove_fast(&sw->router->kern_fib_cache_ht,
&fib_cache->ht_node,
__prestera_kern_fib_cache_ht_params);
+ __prestera_kern_fib_cache_destruct(sw, fib_cache);
kfree(fib_cache);
}
@@ -336,6 +343,36 @@ prestera_k_arb_fib_evt(struct prestera_switch *sw,
return 0;
}
+static void __prestera_k_arb_abort_fib_ht_cb(void *ptr, void *arg)
+{
+ struct prestera_kern_fib_cache *fib_cache = ptr;
+ struct prestera_switch *sw = arg;
+
+ __prestera_k_arb_fib_lpm_offload_set(sw, fib_cache,
+ false, false,
+ false);
+ /* No need to destroy lpm.
+ * It will be aborted by destroy_ht
+ */
+ __prestera_kern_fib_cache_destruct(sw, fib_cache);
+ kfree(fib_cache);
+}
+
+static void prestera_k_arb_abort(struct prestera_switch *sw)
+{
+ /* Function to remove all arbiter entries and related hw objects. */
+ /* Sequence:
+ * 1) Clear arbiter tables, but don't touch hw
+ * 2) Clear hw
+ * We use such approach, because arbiter object is not directly mapped
+ * to hw. So deletion of one arbiter object may even lead to creation of
+ * hw object (e.g. in case of overlapped routes).
+ */
+ rhashtable_free_and_destroy(&sw->router->kern_fib_cache_ht,
+ __prestera_k_arb_abort_fib_ht_cb,
+ sw);
+}
+
static int __prestera_inetaddr_port_event(struct net_device *port_dev,
unsigned long event,
struct netlink_ext_ack *extack)
@@ -602,6 +639,9 @@ void prestera_router_fini(struct prestera_switch *sw)
unregister_fib_notifier(&init_net, &sw->router->fib_nb);
unregister_inetaddr_notifier(&sw->router->inetaddr_nb);
unregister_inetaddr_validator_notifier(&sw->router->inetaddr_valid_nb);
+
+ prestera_k_arb_abort(sw);
+
kfree(sw->router->nhgrp_hw_state_cache);
rhashtable_destroy(&sw->router->kern_fib_cache_ht);
prestera_router_hw_fini(sw);
--
2.17.1
Hello:
This series was applied to netdev/net-next.git (master)
by Jakub Kicinski <[email protected]>:
On Sat, 1 Oct 2022 12:34:08 +0300 you wrote:
> Add support for nexthop routes for Marvell Prestera driver.
> Subscribe on NEIGH_UPDATE events.
>
> Add features:
> - Support connected route adding
> e.g.: "ip address add 1.1.1.1/24 dev sw1p1"
> e.g.: "ip route add 6.6.6/24 dev sw1p1"
> - Support nexthop route adding
> e.g.: "ip route add 5.5.5/24 via 1.1.1.2"
> - Support ECMP route adding
> e.g.: "ip route add 5.5.5/24 nexthop via 1.1.1.2 nexthop via 1.1.1.3"
> - Support "offload" and "trap" flags per each nexthop
> - Support "offload" flag for neighbours
>
> [...]
Here is the summary with links:
- [net-next,v7,1/9] net: marvell: prestera: Add router nexthops ABI
https://git.kernel.org/netdev/net-next/c/0a23ae237171
- [net-next,v7,2/9] net: marvell: prestera: Add cleanup of allocated fib_nodes
https://git.kernel.org/netdev/net-next/c/1e7313e83ef7
- [net-next,v7,3/9] net: marvell: prestera: Add strict cleanup of fib arbiter
https://git.kernel.org/netdev/net-next/c/333fe4d033fa
- [net-next,v7,4/9] net: marvell: prestera: add delayed wq and flush wq on deinit
https://git.kernel.org/netdev/net-next/c/90b6f9c09851
- [net-next,v7,5/9] net: marvell: prestera: Add length macros for prestera_ip_addr
https://git.kernel.org/netdev/net-next/c/59b44ea8aa56
- [net-next,v7,6/9] net: marvell: prestera: Add heplers to interact with fib_notifier_info
https://git.kernel.org/netdev/net-next/c/04f24a1e6de6
- [net-next,v7,7/9] net: marvell: prestera: add stub handler neighbour events
https://git.kernel.org/netdev/net-next/c/8b1ef4911a41
- [net-next,v7,8/9] net: marvell: prestera: Add neighbour cache accounting
https://git.kernel.org/netdev/net-next/c/396b80cb5cc8
- [net-next,v7,9/9] net: marvell: prestera: Propagate nh state from hw to kernel
https://git.kernel.org/netdev/net-next/c/ae15ed6e40c9
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html