2019-07-25 20:44:06

by Brodie Greenfield

[permalink] [raw]
Subject: [PATCH 0/2] Make ipmr queue length configurable

We want to have some more space in our queue for processing incoming
multicast packets, so we can process more of them without dropping
them prematurely. It is useful to be able to increase this limit on
higher-spec platforms that can handle more items.

For the particular use case here at Allied Telesis, we have linux
running on our switches and routers, with support for the number of
multicast groups being increased. Basically, this queue length affects
the time taken to fully learn all of the multicast streams.

Changes in v3:
- Corrected a v4 to v6 typo.

Changes in v2:
- Tidy up a few unnecessary bits of code.
- Put the sysctl inside ip multicast ifdef.
- Included the IPv6 version.

Brodie Greenfield (2):
ipmr: Make cache queue length configurable
ip6mr: Make cache queue length configurable

Documentation/networking/ip-sysctl.txt | 16 ++++++++++++++++
include/net/netns/ipv4.h | 1 +
include/net/netns/ipv6.h | 1 +
net/ipv4/af_inet.c | 1 +
net/ipv4/ipmr.c | 4 +++-
net/ipv4/sysctl_net_ipv4.c | 7 +++++++
net/ipv6/af_inet6.c | 1 +
net/ipv6/ip6mr.c | 4 +++-
net/ipv6/sysctl_net_ipv6.c | 7 +++++++
9 files changed, 40 insertions(+), 2 deletions(-)

--
2.21.0



2019-07-25 20:44:07

by Brodie Greenfield

[permalink] [raw]
Subject: [PATCH 2/2] ip6mr: Make cache queue length configurable

We want to be able to keep more spaces available in our queue for
processing incoming IPv6 multicast traffic (adding (S,G) entries) - this
lets us learn more groups faster, rather than dropping them at this stage.

Signed-off-by: Brodie Greenfield <[email protected]>
---
Documentation/networking/ip-sysctl.txt | 8 ++++++++
include/net/netns/ipv6.h | 1 +
net/ipv6/af_inet6.c | 1 +
net/ipv6/ip6mr.c | 4 +++-
net/ipv6/sysctl_net_ipv6.c | 7 +++++++
5 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 02f77e932adf..68eada3ca915 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -1481,6 +1481,14 @@ skip_notify_on_dev_down - BOOLEAN
on userspace caches to track link events and evict routes.
Default: false (generate message)

+ip6_mr_cache_queue_length - INTEGER
+ Limit the number of multicast packets we can have in the queue to be
+ resolved.
+ Bear in mind that when an unresolved multicast packet is received,
+ there is an O(n) traversal of the queue. This should be considered
+ if increasing.
+ Default: 10
+
IPv6 Fragmentation:

ip6frag_high_thresh - INTEGER
diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
index ef1ed529f33c..84b58424c799 100644
--- a/include/net/netns/ipv6.h
+++ b/include/net/netns/ipv6.h
@@ -46,6 +46,7 @@ struct netns_sysctl_ipv6 {
int max_hbh_opts_len;
int seg6_flowlabel;
bool skip_notify_on_dev_down;
+ unsigned int ip6_mr_cache_queue_length;
};

struct netns_ipv6 {
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index d99753b5e39b..6551bb63e5a2 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -856,6 +856,7 @@ static int __net_init inet6_net_init(struct net *net)
net->ipv6.sysctl.max_hbh_opts_cnt = IP6_DEFAULT_MAX_HBH_OPTS_CNT;
net->ipv6.sysctl.max_dst_opts_len = IP6_DEFAULT_MAX_DST_OPTS_LEN;
net->ipv6.sysctl.max_hbh_opts_len = IP6_DEFAULT_MAX_HBH_OPTS_LEN;
+ net->ipv6.sysctl.ip6_mr_cache_queue_length = 10;
atomic_set(&net->ipv6.fib6_sernum, 1);

err = ipv6_init_mibs(net);
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index cc01aa3f2b5e..bb445871437e 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -1135,6 +1135,7 @@ static int ip6mr_cache_report(struct mr_table *mrt, struct sk_buff *pkt,
static int ip6mr_cache_unresolved(struct mr_table *mrt, mifi_t mifi,
struct sk_buff *skb, struct net_device *dev)
{
+ struct net *net = dev_net(dev);
struct mfc6_cache *c;
bool found = false;
int err;
@@ -1153,7 +1154,8 @@ static int ip6mr_cache_unresolved(struct mr_table *mrt, mifi_t mifi,
* Create a new entry if allowable
*/

- if (atomic_read(&mrt->cache_resolve_queue_len) >= 10 ||
+ if (atomic_read(&mrt->cache_resolve_queue_len) >=
+ net->ipv6.sysctl.ip6_mr_cache_queue_length ||
(c = ip6mr_cache_alloc_unres()) == NULL) {
spin_unlock_bh(&mfc_unres_lock);

diff --git a/net/ipv6/sysctl_net_ipv6.c b/net/ipv6/sysctl_net_ipv6.c
index e15cd37024fd..a27299d4cc34 100644
--- a/net/ipv6/sysctl_net_ipv6.c
+++ b/net/ipv6/sysctl_net_ipv6.c
@@ -159,6 +159,13 @@ static struct ctl_table ipv6_table_template[] = {
.mode = 0644,
.proc_handler = proc_dointvec
},
+ {
+ .procname = "ip6_mr_cache_queue_length",
+ .data = &init_net.ipv6.sysctl.ip6_mr_cache_queue_length,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec
+ },
{ }
};

--
2.21.0


2019-07-26 10:11:55

by Stephen Suryaputra

[permalink] [raw]
Subject: Re: [PATCH 2/2] ip6mr: Make cache queue length configurable

On Fri, Jul 26, 2019 at 08:42:30AM +1200, Brodie Greenfield wrote:
> We want to be able to keep more spaces available in our queue for
> processing incoming IPv6 multicast traffic (adding (S,G) entries) - this
> lets us learn more groups faster, rather than dropping them at this stage.
>
> Signed-off-by: Brodie Greenfield <[email protected]>

Reviewed-by: Stephen Suryaputra <[email protected]>

2019-07-27 20:25:35

by David Miller

[permalink] [raw]
Subject: Re: [PATCH 0/2] Make ipmr queue length configurable

From: Brodie Greenfield <[email protected]>
Date: Fri, 26 Jul 2019 08:42:28 +1200

> We want to have some more space in our queue for processing incoming
> multicast packets, so we can process more of them without dropping
> them prematurely. It is useful to be able to increase this limit on
> higher-spec platforms that can handle more items.
>
> For the particular use case here at Allied Telesis, we have linux
> running on our switches and routers, with support for the number of
> multicast groups being increased. Basically, this queue length affects
> the time taken to fully learn all of the multicast streams.
>
> Changes in v3:
> - Corrected a v4 to v6 typo.

As others have voiced, I think it's dangerous to let every netns
increase this so readily.

We need to either put in a non-initns limit or simply not allow
non-init namespaces to change this.

But really socket queue limits are a better place to enforce this.