2019-03-07 04:58:15

by Brodie Greenfield

[permalink] [raw]
Subject: [PATCH 0/2] Make ipmr queue length configurable

We want to have some more space in our queue for processing incoming
multicast packets, so we can process more of them without dropping
them prematurely. It is useful to be able to increase this limit on
higher-spec platforms that can handle more items.

For the particular use case here at Allied Telesis, we have linux
running on our switches and routers, with support for the number of
multicast groups being increased. Basically, this queue length affects
the time taken to fully learn all of the multicast streams.

Changes in v3:
- Corrected a v4 to v6 typo.

Changes in v2:
- Tidy up a few unnecessary bits of code.
- Put the sysctl inside ip multicast ifdef.
- Included the IPv6 version.

Brodie Greenfield (2):
ipmr: Make cache queue length configurable
ip6mr: Make cache queue length configurable

Documentation/networking/ip-sysctl.txt | 16 ++++++++++++++++
include/net/netns/ipv4.h | 1 +
include/net/netns/ipv6.h | 1 +
net/ipv4/af_inet.c | 1 +
net/ipv4/ipmr.c | 4 +++-
net/ipv4/sysctl_net_ipv4.c | 7 +++++++
net/ipv6/af_inet6.c | 1 +
net/ipv6/ip6mr.c | 4 +++-
net/ipv6/sysctl_net_ipv6.c | 7 +++++++
9 files changed, 40 insertions(+), 2 deletions(-)

--
2.21.0



2019-03-07 04:58:26

by Brodie Greenfield

[permalink] [raw]
Subject: [PATCH 2/2] ip6mr: Make cache queue length configurable

We want to be able to keep more spaces available in our queue for
processing incoming IPv6 multicast traffic (adding (S,G) entries) - this
lets us learn more groups faster, rather than dropping them at this stage.

Signed-off-by: Brodie Greenfield <[email protected]>
---
Documentation/networking/ip-sysctl.txt | 8 ++++++++
include/net/netns/ipv6.h | 1 +
net/ipv6/af_inet6.c | 1 +
net/ipv6/ip6mr.c | 4 +++-
net/ipv6/sysctl_net_ipv6.c | 7 +++++++
5 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 02f77e932adf..68eada3ca915 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -1481,6 +1481,14 @@ skip_notify_on_dev_down - BOOLEAN
on userspace caches to track link events and evict routes.
Default: false (generate message)

+ip6_mr_cache_queue_length - INTEGER
+ Limit the number of multicast packets we can have in the queue to be
+ resolved.
+ Bear in mind that when an unresolved multicast packet is received,
+ there is an O(n) traversal of the queue. This should be considered
+ if increasing.
+ Default: 10
+
IPv6 Fragmentation:

ip6frag_high_thresh - INTEGER
diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
index ef1ed529f33c..84b58424c799 100644
--- a/include/net/netns/ipv6.h
+++ b/include/net/netns/ipv6.h
@@ -46,6 +46,7 @@ struct netns_sysctl_ipv6 {
int max_hbh_opts_len;
int seg6_flowlabel;
bool skip_notify_on_dev_down;
+ unsigned int ip6_mr_cache_queue_length;
};

struct netns_ipv6 {
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index d99753b5e39b..6551bb63e5a2 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -856,6 +856,7 @@ static int __net_init inet6_net_init(struct net *net)
net->ipv6.sysctl.max_hbh_opts_cnt = IP6_DEFAULT_MAX_HBH_OPTS_CNT;
net->ipv6.sysctl.max_dst_opts_len = IP6_DEFAULT_MAX_DST_OPTS_LEN;
net->ipv6.sysctl.max_hbh_opts_len = IP6_DEFAULT_MAX_HBH_OPTS_LEN;
+ net->ipv6.sysctl.ip6_mr_cache_queue_length = 10;
atomic_set(&net->ipv6.fib6_sernum, 1);

err = ipv6_init_mibs(net);
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index cc01aa3f2b5e..bb445871437e 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -1135,6 +1135,7 @@ static int ip6mr_cache_report(struct mr_table *mrt, struct sk_buff *pkt,
static int ip6mr_cache_unresolved(struct mr_table *mrt, mifi_t mifi,
struct sk_buff *skb, struct net_device *dev)
{
+ struct net *net = dev_net(dev);
struct mfc6_cache *c;
bool found = false;
int err;
@@ -1153,7 +1154,8 @@ static int ip6mr_cache_unresolved(struct mr_table *mrt, mifi_t mifi,
* Create a new entry if allowable
*/

- if (atomic_read(&mrt->cache_resolve_queue_len) >= 10 ||
+ if (atomic_read(&mrt->cache_resolve_queue_len) >=
+ net->ipv6.sysctl.ip6_mr_cache_queue_length ||
(c = ip6mr_cache_alloc_unres()) == NULL) {
spin_unlock_bh(&mfc_unres_lock);

diff --git a/net/ipv6/sysctl_net_ipv6.c b/net/ipv6/sysctl_net_ipv6.c
index e15cd37024fd..a27299d4cc34 100644
--- a/net/ipv6/sysctl_net_ipv6.c
+++ b/net/ipv6/sysctl_net_ipv6.c
@@ -159,6 +159,13 @@ static struct ctl_table ipv6_table_template[] = {
.mode = 0644,
.proc_handler = proc_dointvec
},
+ {
+ .procname = "ip6_mr_cache_queue_length",
+ .data = &init_net.ipv6.sysctl.ip6_mr_cache_queue_length,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec
+ },
{ }
};

--
2.21.0


2019-03-07 05:01:47

by Brodie Greenfield

[permalink] [raw]
Subject: [PATCH 1/2] ipmr: Make cache queue length configurable

We want to be able to keep more spaces available in our queue for
processing incoming multicast traffic (adding (S,G) entries) - this lets
us learn more groups faster, rather than dropping them at this stage.

Signed-off-by: Brodie Greenfield <[email protected]>
---
Documentation/networking/ip-sysctl.txt | 8 ++++++++
include/net/netns/ipv4.h | 1 +
net/ipv4/af_inet.c | 1 +
net/ipv4/ipmr.c | 4 +++-
net/ipv4/sysctl_net_ipv4.c | 7 +++++++
5 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index acdfb5d2bcaa..02f77e932adf 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -887,6 +887,14 @@ ip_local_reserved_ports - list of comma separated ranges

Default: Empty

+ip_mr_cache_queue_length - INTEGER
+ Limit the number of multicast packets we can have in the queue to be
+ resolved.
+ Bear in mind that when an unresolved multicast packet is received,
+ there is an O(n) traversal of the queue. This should be considered
+ if increasing.
+ Default: 10
+
ip_unprivileged_port_start - INTEGER
This is a per-namespace sysctl. It defines the first
unprivileged port in the network namespace. Privileged ports
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index 104a6669e344..3411d3f18d51 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -187,6 +187,7 @@ struct netns_ipv4 {
int sysctl_igmp_max_msf;
int sysctl_igmp_llm_reports;
int sysctl_igmp_qrv;
+ unsigned int sysctl_ip_mr_cache_queue_length;

struct ping_group_range ping_group_range;

diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 0dfb72c46671..8e25538bdb1e 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1827,6 +1827,7 @@ static __net_init int inet_init_net(struct net *net)
net->ipv4.sysctl_igmp_llm_reports = 1;
net->ipv4.sysctl_igmp_qrv = 2;

+ net->ipv4.sysctl_ip_mr_cache_queue_length = 10;
return 0;
}

diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index ddbf8c9a1abb..c6a6c3e453a9 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -1127,6 +1127,7 @@ static int ipmr_cache_unresolved(struct mr_table *mrt, vifi_t vifi,
struct sk_buff *skb, struct net_device *dev)
{
const struct iphdr *iph = ip_hdr(skb);
+ struct net *net = dev_net(dev);
struct mfc_cache *c;
bool found = false;
int err;
@@ -1142,7 +1143,8 @@ static int ipmr_cache_unresolved(struct mr_table *mrt, vifi_t vifi,

if (!found) {
/* Create a new entry if allowable */
- if (atomic_read(&mrt->cache_resolve_queue_len) >= 10 ||
+ if (atomic_read(&mrt->cache_resolve_queue_len) >=
+ net->ipv4.sysctl_ip_mr_cache_queue_length ||
(c = ipmr_cache_alloc_unres()) == NULL) {
spin_unlock_bh(&mfc_unres_lock);

diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index ba0fc4b18465..78ae86e8c6cb 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -784,6 +784,13 @@ static struct ctl_table ipv4_net_table[] = {
.proc_handler = proc_dointvec
},
#ifdef CONFIG_IP_MULTICAST
+ {
+ .procname = "ip_mr_cache_queue_length",
+ .data = &init_net.ipv4.sysctl_ip_mr_cache_queue_length,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec
+ },
{
.procname = "igmp_qrv",
.data = &init_net.ipv4.sysctl_igmp_qrv,
--
2.21.0


2019-03-08 23:26:24

by David Miller

[permalink] [raw]
Subject: Re: [PATCH 0/2] Make ipmr queue length configurable

From: Brodie Greenfield <[email protected]>
Date: Thu, 7 Mar 2019 17:57:27 +1300

> We want to have some more space in our queue for processing incoming
> multicast packets, so we can process more of them without dropping
> them prematurely. It is useful to be able to increase this limit on
> higher-spec platforms that can handle more items.
>
> For the particular use case here at Allied Telesis, we have linux
> running on our switches and routers, with support for the number of
> multicast groups being increased. Basically, this queue length affects
> the time taken to fully learn all of the multicast streams.
>
> Changes in v3:
> - Corrected a v4 to v6 typo.

This is a new feature, and therefore net-next material. The net-next
tree is closed, please resubmit when it opens back up.