LinuxLists.cc - [PATCH net-next v4 1/2] New kernel function to get IP overhead on a socket.

2017-03-22 22:59:32

Subject: [PATCH net-next v4 1/2] New kernel function to get IP overhead on a socket.

A new function, kernel_sock_ip_overhead(), is provided
to calculate the cumulative overhead imposed by the IP
Header and IP options, if any, on a socket's payload.
The new function returns an overhead of zero for sockets
that do not belong to the IPv4 or IPv6 address families.

Signed-off-by: R. Parameswaran <[email protected]>
---
include/linux/net.h | 3 +++
net/socket.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 47 insertions(+)

diff --git a/include/linux/net.h b/include/linux/net.h
index 0620f5e..a42fab2 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -298,6 +298,9 @@ int kernel_sendpage(struct socket *sock, struct page *page, int offset,
int kernel_sock_ioctl(struct socket *sock, int cmd, unsigned long arg);
int kernel_sock_shutdown(struct socket *sock, enum sock_shutdown_cmd how);

+/* Following routine returns the IP overhead imposed by a socket. */
+u32 kernel_sock_ip_overhead(struct sock *sk);
+
#define MODULE_ALIAS_NETPROTO(proto) \
MODULE_ALIAS("net-pf-" __stringify(proto))

diff --git a/net/socket.c b/net/socket.c
index e034fe4..69598e1 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -3345,3 +3345,47 @@ int kernel_sock_shutdown(struct socket *sock, enum sock_shutdown_cmd how)
return sock->ops->shutdown(sock, how);
}
EXPORT_SYMBOL(kernel_sock_shutdown);
+
+/* This routine returns the IP overhead imposed by a socket i.e.
+ * the length of the underlying IP header, depending on whether
+ * this is an IPv4 or IPv6 socket and the length from IP options turned
+ * on at the socket.
+ */
+u32 kernel_sock_ip_overhead(struct sock *sk)
+{
+ struct inet_sock *inet;
+ struct ipv6_pinfo *np;
+ struct ip_options_rcu *opt;
+ struct ipv6_txoptions *optv6 = NULL;
+ u32 overhead = 0;
+ bool owned_by_user;
+
+ if (!sk)
+ return overhead;
+
+ owned_by_user = sock_owned_by_user(sk);
+ switch (sk->sk_family) {
+ case AF_INET:
+ inet = inet_sk(sk);
+ overhead += sizeof(struct iphdr);
+ opt = rcu_dereference_protected(inet->inet_opt,
+ owned_by_user);
+ if (opt)
+ overhead += opt->opt.optlen;
+ return overhead;
+#if IS_ENABLED(CONFIG_IPV6)
+ case AF_INET6:
+ np = inet6_sk(sk);
+ overhead += sizeof(struct ipv6hdr);
+ if (np)
+ optv6 = rcu_dereference_protected(np->opt,
+ owned_by_user);
+ if (optv6)
+ overhead += (optv6->opt_flen + optv6->opt_nflen);
+ return overhead;
+#endif /* IS_ENABLED(CONFIG_IPV6) */
+ default: /* Returns 0 overhead if the socket is not ipv4 or ipv6 */
+ return overhead;
+ }
+}
+EXPORT_SYMBOL(kernel_sock_ip_overhead);
--
2.1.4

2017-03-22 23:04:17

by Tom Herbert

[permalink] [raw]

Subject: Re: [PATCH net-next v4 1/2] New kernel function to get IP overhead on a socket.

On Wed, Mar 22, 2017 at 3:59 PM, R. Parameswaran
<[email protected]> wrote:
>
> A new function, kernel_sock_ip_overhead(), is provided
> to calculate the cumulative overhead imposed by the IP
> Header and IP options, if any, on a socket's payload.
> The new function returns an overhead of zero for sockets
> that do not belong to the IPv4 or IPv6 address families.
>
Can you provide some context as to why this is needed?

Tom

> Signed-off-by: R. Parameswaran <[email protected]>
> ---
> include/linux/net.h | 3 +++
> net/socket.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 47 insertions(+)
>
> diff --git a/include/linux/net.h b/include/linux/net.h
> index 0620f5e..a42fab2 100644
> --- a/include/linux/net.h
> +++ b/include/linux/net.h
> @@ -298,6 +298,9 @@ int kernel_sendpage(struct socket *sock, struct page *page, int offset,
> int kernel_sock_ioctl(struct socket *sock, int cmd, unsigned long arg);
> int kernel_sock_shutdown(struct socket *sock, enum sock_shutdown_cmd how);
>
> +/* Following routine returns the IP overhead imposed by a socket. */
> +u32 kernel_sock_ip_overhead(struct sock *sk);
> +
> #define MODULE_ALIAS_NETPROTO(proto) \
> MODULE_ALIAS("net-pf-" __stringify(proto))
>
> diff --git a/net/socket.c b/net/socket.c
> index e034fe4..69598e1 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -3345,3 +3345,47 @@ int kernel_sock_shutdown(struct socket *sock, enum sock_shutdown_cmd how)
> return sock->ops->shutdown(sock, how);
> }
> EXPORT_SYMBOL(kernel_sock_shutdown);
> +
> +/* This routine returns the IP overhead imposed by a socket i.e.
> + * the length of the underlying IP header, depending on whether
> + * this is an IPv4 or IPv6 socket and the length from IP options turned
> + * on at the socket.
> + */
> +u32 kernel_sock_ip_overhead(struct sock *sk)
> +{
> + struct inet_sock *inet;
> + struct ipv6_pinfo *np;
> + struct ip_options_rcu *opt;
> + struct ipv6_txoptions *optv6 = NULL;
> + u32 overhead = 0;
> + bool owned_by_user;
> +
> + if (!sk)
> + return overhead;
> +
> + owned_by_user = sock_owned_by_user(sk);
> + switch (sk->sk_family) {
> + case AF_INET:
> + inet = inet_sk(sk);
> + overhead += sizeof(struct iphdr);
> + opt = rcu_dereference_protected(inet->inet_opt,
> + owned_by_user);
> + if (opt)
> + overhead += opt->opt.optlen;
> + return overhead;
> +#if IS_ENABLED(CONFIG_IPV6)
> + case AF_INET6:
> + np = inet6_sk(sk);
> + overhead += sizeof(struct ipv6hdr);
> + if (np)
> + optv6 = rcu_dereference_protected(np->opt,
> + owned_by_user);
> + if (optv6)
> + overhead += (optv6->opt_flen + optv6->opt_nflen);
> + return overhead;
> +#endif /* IS_ENABLED(CONFIG_IPV6) */
> + default: /* Returns 0 overhead if the socket is not ipv4 or ipv6 */
> + return overhead;
> + }
> +}
> +EXPORT_SYMBOL(kernel_sock_ip_overhead);
> --
> 2.1.4
>

2017-03-23 22:05:32

by David Miller

[permalink] [raw]

Subject: Re: [PATCH net-next v4 1/2] New kernel function to get IP overhead on a socket.

From: "R. Parameswaran" <[email protected]>
Date: Wed, 22 Mar 2017 15:59:13 -0700 (PDT)

> A new function, kernel_sock_ip_overhead(), is provided
> to calculate the cumulative overhead imposed by the IP
> Header and IP options, if any, on a socket's payload.
> The new function returns an overhead of zero for sockets
> that do not belong to the IPv4 or IPv6 address families.
>
> Signed-off-by: R. Parameswaran <[email protected]>

Just use the IPv4/IPv6 header size for now, just like the VXLAN
driver does.

Thanks.

2017-03-24 01:04:47

by kernel test robot

[permalink] [raw]

Subject: Re: [PATCH net-next v4 1/2] New kernel function to get IP overhead on a socket.

Hi Parameswaran,

[auto build test WARNING on net-next/master]

url: https://github.com/0day-ci/linux/commits/R-Parameswaran/New-kernel-function-to-get-IP-overhead-on-a-socket/20170324-065003
config: x86_64-randconfig-s1-03240701 (attached as .config)
compiler: gcc-4.4 (Debian 4.4.7-8) 4.4.7
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64

All warnings (new ones prefixed by >>):

net/socket.c: In function 'kernel_sock_ip_overhead':
>> net/socket.c:3359: warning: unused variable 'optv6'
>> net/socket.c:3357: warning: unused variable 'np'

vim +/optv6 +3359 net/socket.c

3351 * this is an IPv4 or IPv6 socket and the length from IP options turned
3352 * on at the socket.
3353 */
3354 u32 kernel_sock_ip_overhead(struct sock *sk)
3355 {
3356 struct inet_sock *inet;
> 3357 struct ipv6_pinfo *np;
3358 struct ip_options_rcu *opt;
> 3359 struct ipv6_txoptions *optv6 = NULL;
3360 u32 overhead = 0;
3361 bool owned_by_user;
3362

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation

Attachments:

(No filename) (1.15 kB)
.config.gz (23.71 kB)
Download all attachments

2017-03-24 01:52:06

by R. Parameswaran

[permalink] [raw]

Subject: Re: [PATCH net-next v4 1/2] New kernel function to get IP overhead on a socket.

Hi Dave,

Please see inline:

On Thu, 23 Mar 2017, David Miller wrote:

> From: "R. Parameswaran" <[email protected]>
> Date: Wed, 22 Mar 2017 15:59:13 -0700 (PDT)
>
> > A new function, kernel_sock_ip_overhead(), is provided
> > to calculate the cumulative overhead imposed by the IP
> > Header and IP options, if any, on a socket's payload.
> > The new function returns an overhead of zero for sockets
> > that do not belong to the IPv4 or IPv6 address families.
> >
> > Signed-off-by: R. Parameswaran <[email protected]>
>
> Just use the IPv4/IPv6 header size for now, just like the VXLAN
> driver does.
>

Actually, that's how the original posting was - it was changed in
response to a review comment from James Chapman requesting the IP
Options overhead to be factored in and for this to be calculated in
a new standalone function that can be reused in other situations.
The review comment makes sense to me - the kernel seems to do a
good job of accounting for the cumulative size of IP Options and
if the information is available, it may make sense to factor it in.

I guess you are concerned about compatibility between vxlan and
L2TP? There may be one difference - the socket for vxlan
appears to be opened/controlled entirely within kernel code (seems
to call udp_sock_create() which does not appear to turn on any options),
but in the case of L2TP, it is possible for the tunnel socket to be
opened from user space, if a user space control plane daemon is running.
Regardless of how user space daemons are written right now, it is
possible in theory for the user space code to turn on options on the
L2TP tunnel socket. So it seems that IP options might be enabled on the
L2TP socket, but are probably unlikely on the vxlan socket?

I'd suggest giving this a few days for James to respond.
At that time if there is agreement that we don't need to factor options,
I can rework it.

thanks,

Ramkumar

> Thanks.
>

2017-03-24 13:13:49

by James Chapman

[permalink] [raw]

Subject: Re: [PATCH net-next v4 1/2] New kernel function to get IP overhead on a socket.

On 24/03/17 01:51, R. Parameswaran wrote:
> Hi Dave,
>
> Please see inline:
>
> On Thu, 23 Mar 2017, David Miller wrote:
>
>> From: "R. Parameswaran" <[email protected]>
>> Date: Wed, 22 Mar 2017 15:59:13 -0700 (PDT)
>>
>>> A new function, kernel_sock_ip_overhead(), is provided
>>> to calculate the cumulative overhead imposed by the IP
>>> Header and IP options, if any, on a socket's payload.
>>> The new function returns an overhead of zero for sockets
>>> that do not belong to the IPv4 or IPv6 address families.
>>>
>>> Signed-off-by: R. Parameswaran <[email protected]>
>> Just use the IPv4/IPv6 header size for now, just like the VXLAN
>> driver does.
>>
> Actually, that's how the original posting was - it was changed in
> response to a review comment from James Chapman requesting the IP
> Options overhead to be factored in and for this to be calculated in
> a new standalone function that can be reused in other situations.
> The review comment makes sense to me - the kernel seems to do a
> good job of accounting for the cumulative size of IP Options and
> if the information is available, it may make sense to factor it in.
>
> I guess you are concerned about compatibility between vxlan and
> L2TP? There may be one difference - the socket for vxlan
> appears to be opened/controlled entirely within kernel code (seems
> to call udp_sock_create() which does not appear to turn on any options),
> but in the case of L2TP, it is possible for the tunnel socket to be
> opened from user space, if a user space control plane daemon is running.
> Regardless of how user space daemons are written right now, it is
> possible in theory for the user space code to turn on options on the
> L2TP tunnel socket. So it seems that IP options might be enabled on the
> L2TP socket, but are probably unlikely on the vxlan socket?
>
> I'd suggest giving this a few days for James to respond.
> At that time if there is agreement that we don't need to factor options,
> I can rework it.

The reason I suggested factoring in IP options here is because L2TP
tunnel sockets can be opened by userspace daemons. When an L2TP control
plane is used, the L2TP daemon opens the tunnel socket. When no control
plane is used, L2TP connections are manually configured, usually by ip
l2tp commands, and the tunnel socket is created by the kernel.

For sockets created by userspace, the L2TP daemon could derive the
cumulative size of all IP options that it sets and use that to set the
L2TP session's MTU. Setting an MTU when establishing L2TP connections
overrides the kernel's own default MTU calculations anyway. But the L2TP
packet header overhead depends on several other L2TP-specific options,
e.g. cookie size and the kernel already takes these into account when
calculating MTU. If IP options are ignored, we'd have the situation
where the default MTU value avoids fragmentation but only if certain IP
options are not enabled on the tunnel socket.

For L2TP, I'd prefer that the kernel takes IP options into account. If a
function were added to the kernel to return the IP header overhead of an
IP socket, initially for L2TP, I figured it might be useful for other
tunnel protocols too, hence I suggested it was added as a generic
function. Maybe that was a mistake. If instead this function is renamed
and made L2TP-private, that would be ok for me.

James

2017-04-03 20:28:15

by R. Parameswaran

[permalink] [raw]

Subject: Re: [PATCH net-next v4 1/2] New kernel function to get IP overhead on a socket.

Hi James, Dave,

Sorry for the delay (was away), please see inline:

On Fri, 24 Mar 2017, James Chapman wrote:

> On 24/03/17 01:51, R. Parameswaran wrote:
> > Hi Dave,
> >
> > Please see inline:
> >
> > On Thu, 23 Mar 2017, David Miller wrote:
> >
> >> From: "R. Parameswaran" <[email protected]>
> >> Date: Wed, 22 Mar 2017 15:59:13 -0700 (PDT)
> >>
> >>> A new function, kernel_sock_ip_overhead(), is provided
> >>> to calculate the cumulative overhead imposed by the IP
> >>> Header and IP options, if any, on a socket's payload.
> >>> The new function returns an overhead of zero for sockets
> >>> that do not belong to the IPv4 or IPv6 address families.
> >>>
> >>> Signed-off-by: R. Parameswaran <[email protected]>
> >> Just use the IPv4/IPv6 header size for now, just like the VXLAN
> >> driver does.
> >>
> > Actually, that's how the original posting was - it was changed in
> > response to a review comment from James Chapman requesting the IP
> > Options overhead to be factored in and for this to be calculated in
> > a new standalone function that can be reused in other situations.
> > The review comment makes sense to me - the kernel seems to do a
> > good job of accounting for the cumulative size of IP Options and
> > if the information is available, it may make sense to factor it in.
> >
> > I guess you are concerned about compatibility between vxlan and
> > L2TP? There may be one difference - the socket for vxlan
> > appears to be opened/controlled entirely within kernel code (seems
> > to call udp_sock_create() which does not appear to turn on any options),
> > but in the case of L2TP, it is possible for the tunnel socket to be
> > opened from user space, if a user space control plane daemon is running.
> > Regardless of how user space daemons are written right now, it is
> > possible in theory for the user space code to turn on options on the
> > L2TP tunnel socket. So it seems that IP options might be enabled on the
> > L2TP socket, but are probably unlikely on the vxlan socket?
> >
> > I'd suggest giving this a few days for James to respond.
> > At that time if there is agreement that we don't need to factor options,
> > I can rework it.
>
> The reason I suggested factoring in IP options here is because L2TP
> tunnel sockets can be opened by userspace daemons. When an L2TP control
> plane is used, the L2TP daemon opens the tunnel socket. When no control
> plane is used, L2TP connections are manually configured, usually by ip
> l2tp commands, and the tunnel socket is created by the kernel.
>
> For sockets created by userspace, the L2TP daemon could derive the
> cumulative size of all IP options that it sets and use that to set the
> L2TP session's MTU. Setting an MTU when establishing L2TP connections
> overrides the kernel's own default MTU calculations anyway. But the L2TP
> packet header overhead depends on several other L2TP-specific options,
> e.g. cookie size and the kernel already takes these into account when
> calculating MTU. If IP options are ignored, we'd have the situation
> where the default MTU value avoids fragmentation but only if certain IP
> options are not enabled on the tunnel socket.
>
> For L2TP, I'd prefer that the kernel takes IP options into account. If a
> function were added to the kernel to return the IP header overhead of an
> IP socket, initially for L2TP, I figured it might be useful for other
> tunnel protocols too, hence I suggested it was added as a generic
> function. Maybe that was a mistake. If instead this function is renamed
> and made L2TP-private, that would be ok for me.
>
> James
>

Can I take this to mean that we do need to factor in IP options in
the L2TP device MTU setup (i.e approach in the posted patch is okay)?

If yes, please let me know if I can keep the socket IP option overhead
calculations in a generic function, or it would be better to move it back into
L2TP code?

There are a couple of krobot warnings which I can fix on the next upload.

thanks,

Ramkumar

>
>

2017-04-03 20:30:45

by David Miller

[permalink] [raw]

Subject: Re: [PATCH net-next v4 1/2] New kernel function to get IP overhead on a socket.

From: "R. Parameswaran" <[email protected]>
Date: Mon, 3 Apr 2017 13:28:11 -0700 (PDT)

> Can I take this to mean that we do need to factor in IP options in
> the L2TP device MTU setup (i.e approach in the posted patch is okay)?
>
> If yes, please let me know if I can keep the socket IP option overhead
> calculations in a generic function, or it would be better to move it back into
> L2TP code?

If the user creates and maintains this UDP socket, then yes we have to
account for potential IP options.

2017-04-04 02:12:23

by R. Parameswaran

[permalink] [raw]

Subject: Re: [PATCH net-next v4 1/2] New kernel function to get IP overhead on a socket.

Hi Dave,

Please see inline:

On Mon, 3 Apr 2017, David Miller wrote:

> From: "R. Parameswaran" <[email protected]>
> Date: Mon, 3 Apr 2017 13:28:11 -0700 (PDT)
>
> > Can I take this to mean that we do need to factor in IP options in
> > the L2TP device MTU setup (i.e approach in the posted patch is okay)?
> >
> > If yes, please let me know if I can keep the socket IP option overhead
> > calculations in a generic function, or it would be better to move it back into
> > L2TP code?
>
> If the user creates and maintains this UDP socket, then yes we have to
> account for potential IP options.
>

Can I take this to mean that the patch in its present form is
acceptable (patch currently accounts for IP options on the socket)?
Please let me know if any further change is needed (I'll clean up the
krobot reported errors after this).

thanks,

Ramkumar

2017-04-04 02:19:22

by David Miller

[permalink] [raw]

Subject: Re: [PATCH net-next v4 1/2] New kernel function to get IP overhead on a socket.

From: "R. Parameswaran" <[email protected]>
Date: Mon, 3 Apr 2017 19:12:20 -0700 (PDT)

>
>
> Hi Dave,
>
> Please see inline:
>
> On Mon, 3 Apr 2017, David Miller wrote:
>
>> From: "R. Parameswaran" <[email protected]>
>> Date: Mon, 3 Apr 2017 13:28:11 -0700 (PDT)
>>
>> > Can I take this to mean that we do need to factor in IP options in
>> > the L2TP device MTU setup (i.e approach in the posted patch is okay)?
>> >
>> > If yes, please let me know if I can keep the socket IP option overhead
>> > calculations in a generic function, or it would be better to move it back into
>> > L2TP code?
>>
>> If the user creates and maintains this UDP socket, then yes we have to
>> account for potential IP options.
>>
>
> Can I take this to mean that the patch in its present form is
> acceptable (patch currently accounts for IP options on the socket)?
> Please let me know if any further change is needed (I'll clean up the
> krobot reported errors after this).

Yes, please respin the patch with the krobot errors fixed.