2023-08-09 01:28:41

by Tahsin Erdogan

[permalink] [raw]
Subject: [PATCH v3] tun: avoid high-order page allocation for packet header

When GSO is not enabled and a packet is transmitted via writev(), all
payload is treated as header which requires a contiguous memory allocation.
This allocation request is harder to satisfy, and may even fail if there is
enough fragmentation.

Note that sendmsg() code path limits the linear copy length, so this change
makes writev() and sendmsg() more consistent.

Signed-off-by: Tahsin Erdogan <[email protected]>
---
v3: rebase to latest net-next
v2: replace linear == 0 with !linear
v1: https://lore.kernel.org/all/[email protected]/
drivers/net/tun.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 5beb6b5dd7e5..53d19c958a20 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1523,7 +1523,7 @@ static struct sk_buff *tun_alloc_skb(struct tun_file *tfile,
int err;

/* Under a page? Don't bother with paged skb. */
- if (prepad + len < PAGE_SIZE || !linear)
+ if (prepad + len < PAGE_SIZE)
linear = len;

if (len - linear > MAX_SKB_FRAGS * (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
@@ -1913,6 +1913,9 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
*/
zerocopy = false;
} else {
+ if (!linear)
+ linear = min_t(size_t, good_linear, copylen);
+
skb = tun_alloc_skb(tfile, align, copylen, linear,
noblock);
}
--
2.41.0



2023-08-09 13:31:58

by Willem de Bruijn

[permalink] [raw]
Subject: RE: [PATCH v3] tun: avoid high-order page allocation for packet header

Tahsin Erdogan wrote:
> When GSO is not enabled

Not GSO, but gso.hdr_len, which is a feature of IFF_VNET_HDR.

VIRTIO_NET_HDR_GSO_* does not need to be enabled to use the
header length field.

> and a packet is transmitted via writev(), all
> payload is treated as header which requires a contiguous memory allocation.
> This allocation request is harder to satisfy, and may even fail if there is
> enough fragmentation.
>
> Note that sendmsg() code path limits the linear copy length, so this change
> makes writev() and sendmsg() more consistent.

This is not specific to writev(), equally to more common write().

Tun sendmsg is a special case, only used by vhost-net from inside the
kernel. Arguably consistency with packet_snd/packet_alloc_skb would be
more important. That said, this makes sense to me. I assume your
configuring a device with very large MTU?


> Signed-off-by: Tahsin Erdogan <[email protected]>
> ---
> v3: rebase to latest net-next
> v2: replace linear == 0 with !linear
> v1: https://lore.kernel.org/all/[email protected]/
> drivers/net/tun.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index 5beb6b5dd7e5..53d19c958a20 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1523,7 +1523,7 @@ static struct sk_buff *tun_alloc_skb(struct tun_file *tfile,
> int err;
>
> /* Under a page? Don't bother with paged skb. */
> - if (prepad + len < PAGE_SIZE || !linear)
> + if (prepad + len < PAGE_SIZE)
> linear = len;
>
> if (len - linear > MAX_SKB_FRAGS * (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
> @@ -1913,6 +1913,9 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
> */
> zerocopy = false;
> } else {
> + if (!linear)
> + linear = min_t(size_t, good_linear, copylen);
> +
> skb = tun_alloc_skb(tfile, align, copylen, linear,
> noblock);
> }
> --
> 2.41.0
>



2023-08-09 15:30:20

by Willem de Bruijn

[permalink] [raw]
Subject: Re: [PATCH v3] tun: avoid high-order page allocation for packet header

Erdogan, Tahsin wrote:
> On Wed, 2023-08-09 at 09:18 -0400, Willem de Bruijn wrote:
> > Tun sendmsg is a special case, only used by vhost-net from inside the
> > kernel. Arguably consistency with packet_snd/packet_alloc_skb would
> > be
> > more important. That said, this makes sense to me. I assume your
> > configuring a device with very large MTU?
>
> That's right. I am setting MTU to 9100 in my test.

Makes sense. That's not even that large.

Please address the commit message points about virtio_net_hdr.hdr_len
and write() vs writev().

A writev() specific solution could even take the first iov length as
hint. Note that I'm not suggesting that. IFF_NAPI_FRAGS already does
exactly that, plus the geometry of subsequent frags.

2023-08-09 15:52:20

by Tahsin Erdogan

[permalink] [raw]
Subject: Re: [PATCH v3] tun: avoid high-order page allocation for packet header

On Wed, 2023-08-09 at 09:18 -0400, Willem de Bruijn wrote:
> Tun sendmsg is a special case, only used by vhost-net from inside the
> kernel. Arguably consistency with packet_snd/packet_alloc_skb would
> be
> more important. That said, this makes sense to me. I assume your
> configuring a device with very large MTU?

That's right. I am setting MTU to 9100 in my test.

2023-08-09 16:16:42

by Tahsin Erdogan

[permalink] [raw]
Subject: Re: [PATCH v3] tun: avoid high-order page allocation for packet header

On Wed, 2023-08-09 at 10:01 -0400, Willem de Bruijn wrote:
> Erdogan, Tahsin wrote:
> > On Wed, 2023-08-09 at 09:18 -0400, Willem de Bruijn wrote:
> > > Tun sendmsg is a special case, only used by vhost-net from inside
> > > the
> > > kernel. Arguably consistency with packet_snd/packet_alloc_skb
> > > would
> > > be
> > > more important. That said, this makes sense to me. I assume your
> > > configuring a device with very large MTU?
> >
> > That's right. I am setting MTU to 9100 in my test.
>
> Makes sense. That's not even that large.
>
> Please address the commit message points about virtio_net_hdr.hdr_len
> and write() vs writev().

Yes, I will update commit message in v4. thank you.