2022-09-23 16:55:58

by Pavel Begunkov

[permalink] [raw]
Subject: [PATCH net-next 0/4] shrink struct ubuf_info

struct ubuf_info is large but not all fields are needed for all
cases. We have limited space in io_uring for it and large ubuf_info
prevents some struct embedding, even though we use only a subset
of the fields. It's also not very clean trying to use this typeless
extra space.

Shrink struct ubuf_info to only necessary fields used in generic paths,
namely ->callback, ->refcnt and ->flags, which take only 16 bytes. And
make MSG_ZEROCOPY and some other users to embed it into a larger struct
ubuf_info_msgzc mimicking the former ubuf_info.

Note, xen/vhost may also have some cleaning on top by creating
new structs containing ubuf_info but with proper types.

Pavel Begunkov (4):
net: introduce struct ubuf_info_msgzc
xen/netback: use struct ubuf_info_msgzc
vhost/net: use struct ubuf_info_msgzc
net: shrink struct ubuf_info

drivers/net/xen-netback/common.h | 2 +-
drivers/net/xen-netback/interface.c | 4 +--
drivers/net/xen-netback/netback.c | 7 +++---
drivers/vhost/net.c | 15 ++++++------
include/linux/skbuff.h | 11 +++++++--
net/core/skbuff.c | 38 ++++++++++++++++-------------
net/ipv4/ip_output.c | 2 +-
net/ipv4/tcp.c | 2 +-
net/ipv6/ip6_output.c | 2 +-
9 files changed, 48 insertions(+), 35 deletions(-)

--
2.37.2


2022-09-23 16:56:32

by Pavel Begunkov

[permalink] [raw]
Subject: [PATCH net-next 2/4] xen/netback: use struct ubuf_info_msgzc

struct ubuf_info will be changed, use ubuf_info_msgzc instead.

Signed-off-by: Pavel Begunkov <[email protected]>
---
drivers/net/xen-netback/common.h | 2 +-
drivers/net/xen-netback/interface.c | 4 ++--
drivers/net/xen-netback/netback.c | 7 ++++---
3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 8174d7b2966c..1545cbee77a4 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -62,7 +62,7 @@ struct pending_tx_info {
* ubuf_to_vif is a helper which finds the struct xenvif from a pointer
* to this field.
*/
- struct ubuf_info callback_struct;
+ struct ubuf_info_msgzc callback_struct;
};

#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, XEN_PAGE_SIZE)
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index fb32ae82d9b0..e579ecd40b74 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -591,8 +591,8 @@ int xenvif_init_queue(struct xenvif_queue *queue)
}

for (i = 0; i < MAX_PENDING_REQS; i++) {
- queue->pending_tx_info[i].callback_struct = (struct ubuf_info)
- { .callback = xenvif_zerocopy_callback,
+ queue->pending_tx_info[i].callback_struct = (struct ubuf_info_msgzc)
+ { { .callback = xenvif_zerocopy_callback },
{ { .ctx = NULL,
.desc = i } } };
queue->grant_tx_handle[i] = NETBACK_INVALID_HANDLE;
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index a256695fc89e..3d2081bbbc86 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -133,7 +133,7 @@ static inline unsigned long idx_to_kaddr(struct xenvif_queue *queue,

/* Find the containing VIF's structure from a pointer in pending_tx_info array
*/
-static inline struct xenvif_queue *ubuf_to_queue(const struct ubuf_info *ubuf)
+static inline struct xenvif_queue *ubuf_to_queue(const struct ubuf_info_msgzc *ubuf)
{
u16 pending_idx = ubuf->desc;
struct pending_tx_info *temp =
@@ -1228,11 +1228,12 @@ static int xenvif_tx_submit(struct xenvif_queue *queue)
return work_done;
}

-void xenvif_zerocopy_callback(struct sk_buff *skb, struct ubuf_info *ubuf,
+void xenvif_zerocopy_callback(struct sk_buff *skb, struct ubuf_info *ubuf_base,
bool zerocopy_success)
{
unsigned long flags;
pending_ring_idx_t index;
+ struct ubuf_info_msgzc *ubuf = uarg_to_msgzc(ubuf_base);
struct xenvif_queue *queue = ubuf_to_queue(ubuf);

/* This is the only place where we grab this lock, to protect callbacks
@@ -1241,7 +1242,7 @@ void xenvif_zerocopy_callback(struct sk_buff *skb, struct ubuf_info *ubuf,
spin_lock_irqsave(&queue->callback_lock, flags);
do {
u16 pending_idx = ubuf->desc;
- ubuf = (struct ubuf_info *) ubuf->ctx;
+ ubuf = (struct ubuf_info_msgzc *) ubuf->ctx;
BUG_ON(queue->dealloc_prod - queue->dealloc_cons >=
MAX_PENDING_REQS);
index = pending_index(queue->dealloc_prod);
--
2.37.2

2022-09-23 16:57:53

by Pavel Begunkov

[permalink] [raw]
Subject: [PATCH net-next 4/4] net: shrink struct ubuf_info

We can benefit from a smaller struct ubuf_info, so leave only mandatory
fields and let users to decide how they want to extend it. Convert
MSG_ZEROCOPY to struct ubuf_info_msgzc and remove duplicated fields.
This reduces the size from 48 bytes to just 16.

Signed-off-by: Pavel Begunkov <[email protected]>
---
include/linux/skbuff.h | 22 ++++------------------
net/core/skbuff.c | 38 +++++++++++++++++++++-----------------
net/ipv4/ip_output.c | 2 +-
net/ipv4/tcp.c | 2 +-
net/ipv6/ip6_output.c | 2 +-
5 files changed, 28 insertions(+), 38 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index fd7dcb977fdf..920eb6413fee 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -533,25 +533,8 @@ enum {
struct ubuf_info {
void (*callback)(struct sk_buff *, struct ubuf_info *,
bool zerocopy_success);
- union {
- struct {
- unsigned long desc;
- void *ctx;
- };
- struct {
- u32 id;
- u16 len;
- u16 zerocopy:1;
- u32 bytelen;
- };
- };
refcount_t refcnt;
u8 flags;
-
- struct mmpin {
- struct user_struct *user;
- unsigned int num_pg;
- } mmp;
};

struct ubuf_info_msgzc {
@@ -570,7 +553,10 @@ struct ubuf_info_msgzc {
};
};

- struct mmpin mmp;
+ struct mmpin {
+ struct user_struct *user;
+ unsigned int num_pg;
+ } mmp;
};

#define skb_uarg(SKB) ((struct ubuf_info *)(skb_shinfo(SKB)->destructor_arg))
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index f1b8b20fc20b..bbcfb1c7f59e 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1188,7 +1188,7 @@ EXPORT_SYMBOL_GPL(mm_unaccount_pinned_pages);

static struct ubuf_info *msg_zerocopy_alloc(struct sock *sk, size_t size)
{
- struct ubuf_info *uarg;
+ struct ubuf_info_msgzc *uarg;
struct sk_buff *skb;

WARN_ON_ONCE(!in_task());
@@ -1206,19 +1206,19 @@ static struct ubuf_info *msg_zerocopy_alloc(struct sock *sk, size_t size)
return NULL;
}

- uarg->callback = msg_zerocopy_callback;
+ uarg->ubuf.callback = msg_zerocopy_callback;
uarg->id = ((u32)atomic_inc_return(&sk->sk_zckey)) - 1;
uarg->len = 1;
uarg->bytelen = size;
uarg->zerocopy = 1;
- uarg->flags = SKBFL_ZEROCOPY_FRAG | SKBFL_DONT_ORPHAN;
- refcount_set(&uarg->refcnt, 1);
+ uarg->ubuf.flags = SKBFL_ZEROCOPY_FRAG | SKBFL_DONT_ORPHAN;
+ refcount_set(&uarg->ubuf.refcnt, 1);
sock_hold(sk);

- return uarg;
+ return &uarg->ubuf;
}

-static inline struct sk_buff *skb_from_uarg(struct ubuf_info *uarg)
+static inline struct sk_buff *skb_from_uarg(struct ubuf_info_msgzc *uarg)
{
return container_of((void *)uarg, struct sk_buff, cb);
}
@@ -1227,6 +1227,7 @@ struct ubuf_info *msg_zerocopy_realloc(struct sock *sk, size_t size,
struct ubuf_info *uarg)
{
if (uarg) {
+ struct ubuf_info_msgzc *uarg_zc;
const u32 byte_limit = 1 << 19; /* limit to a few TSO */
u32 bytelen, next;

@@ -1242,8 +1243,9 @@ struct ubuf_info *msg_zerocopy_realloc(struct sock *sk, size_t size,
return NULL;
}

- bytelen = uarg->bytelen + size;
- if (uarg->len == USHRT_MAX - 1 || bytelen > byte_limit) {
+ uarg_zc = uarg_to_msgzc(uarg);
+ bytelen = uarg_zc->bytelen + size;
+ if (uarg_zc->len == USHRT_MAX - 1 || bytelen > byte_limit) {
/* TCP can create new skb to attach new uarg */
if (sk->sk_type == SOCK_STREAM)
goto new_alloc;
@@ -1251,11 +1253,11 @@ struct ubuf_info *msg_zerocopy_realloc(struct sock *sk, size_t size,
}

next = (u32)atomic_read(&sk->sk_zckey);
- if ((u32)(uarg->id + uarg->len) == next) {
- if (mm_account_pinned_pages(&uarg->mmp, size))
+ if ((u32)(uarg_zc->id + uarg_zc->len) == next) {
+ if (mm_account_pinned_pages(&uarg_zc->mmp, size))
return NULL;
- uarg->len++;
- uarg->bytelen = bytelen;
+ uarg_zc->len++;
+ uarg_zc->bytelen = bytelen;
atomic_set(&sk->sk_zckey, ++next);

/* no extra ref when appending to datagram (MSG_MORE) */
@@ -1291,7 +1293,7 @@ static bool skb_zerocopy_notify_extend(struct sk_buff *skb, u32 lo, u16 len)
return true;
}

-static void __msg_zerocopy_callback(struct ubuf_info *uarg)
+static void __msg_zerocopy_callback(struct ubuf_info_msgzc *uarg)
{
struct sk_buff *tail, *skb = skb_from_uarg(uarg);
struct sock_exterr_skb *serr;
@@ -1344,19 +1346,21 @@ static void __msg_zerocopy_callback(struct ubuf_info *uarg)
void msg_zerocopy_callback(struct sk_buff *skb, struct ubuf_info *uarg,
bool success)
{
- uarg->zerocopy = uarg->zerocopy & success;
+ struct ubuf_info_msgzc *uarg_zc = uarg_to_msgzc(uarg);
+
+ uarg_zc->zerocopy = uarg_zc->zerocopy & success;

if (refcount_dec_and_test(&uarg->refcnt))
- __msg_zerocopy_callback(uarg);
+ __msg_zerocopy_callback(uarg_zc);
}
EXPORT_SYMBOL_GPL(msg_zerocopy_callback);

void msg_zerocopy_put_abort(struct ubuf_info *uarg, bool have_uref)
{
- struct sock *sk = skb_from_uarg(uarg)->sk;
+ struct sock *sk = skb_from_uarg(uarg_to_msgzc(uarg))->sk;

atomic_dec(&sk->sk_zckey);
- uarg->len--;
+ uarg_to_msgzc(uarg)->len--;

if (have_uref)
msg_zerocopy_callback(NULL, uarg, true);
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 8201cd423ff9..1ae83ad629b2 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1043,7 +1043,7 @@ static int __ip_append_data(struct sock *sk,
paged = true;
zc = true;
} else {
- uarg->zerocopy = 0;
+ uarg_to_msgzc(uarg)->zerocopy = 0;
skb_zcopy_set(skb, uarg, &extra_uref);
}
}
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 829beee3fa32..34fd52c07534 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1239,7 +1239,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
}
zc = sk->sk_route_caps & NETIF_F_SG;
if (!zc)
- uarg->zerocopy = 0;
+ uarg_to_msgzc(uarg)->zerocopy = 0;
}
}

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index e3dbb32b70dc..2f279cbd2648 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1569,7 +1569,7 @@ static int __ip6_append_data(struct sock *sk,
paged = true;
zc = true;
} else {
- uarg->zerocopy = 0;
+ uarg_to_msgzc(uarg)->zerocopy = 0;
skb_zcopy_set(skb, uarg, &extra_uref);
}
}
--
2.37.2

2022-09-23 16:58:17

by Pavel Begunkov

[permalink] [raw]
Subject: [PATCH net-next 1/4] net: introduce struct ubuf_info_msgzc

We're going to split struct ubuf_info and leave there only
mandatory fields. Users are free to extend it. Add struct
ubuf_info_msgzc, which will be an extended version for MSG_ZEROCOPY and
some other users. It duplicates of struct ubuf_info for now and will be
removed in a couple of patches.

Signed-off-by: Pavel Begunkov <[email protected]>
---
include/linux/skbuff.h | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index f15d5b62539b..fd7dcb977fdf 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -554,7 +554,28 @@ struct ubuf_info {
} mmp;
};

+struct ubuf_info_msgzc {
+ struct ubuf_info ubuf;
+
+ union {
+ struct {
+ unsigned long desc;
+ void *ctx;
+ };
+ struct {
+ u32 id;
+ u16 len;
+ u16 zerocopy:1;
+ u32 bytelen;
+ };
+ };
+
+ struct mmpin mmp;
+};
+
#define skb_uarg(SKB) ((struct ubuf_info *)(skb_shinfo(SKB)->destructor_arg))
+#define uarg_to_msgzc(ubuf_ptr) container_of((ubuf_ptr), struct ubuf_info_msgzc, \
+ ubuf)

int mm_account_pinned_pages(struct mmpin *mmp, size_t size);
void mm_unaccount_pinned_pages(struct mmpin *mmp);
--
2.37.2

2022-09-23 17:12:03

by Pavel Begunkov

[permalink] [raw]
Subject: [PATCH net-next 3/4] vhost/net: use struct ubuf_info_msgzc

struct ubuf_info will be changed, use ubuf_info_msgzc instead.

Signed-off-by: Pavel Begunkov <[email protected]>
---
drivers/vhost/net.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 68e4ecd1cc0e..d7a04d573988 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -118,7 +118,7 @@ struct vhost_net_virtqueue {
/* Number of XDP frames batched */
int batched_xdp;
/* an array of userspace buffers info */
- struct ubuf_info *ubuf_info;
+ struct ubuf_info_msgzc *ubuf_info;
/* Reference counting for outstanding ubufs.
* Protected by vq mutex. Writers must also take device mutex. */
struct vhost_net_ubuf_ref *ubufs;
@@ -382,8 +382,9 @@ static void vhost_zerocopy_signal_used(struct vhost_net *net,
}

static void vhost_zerocopy_callback(struct sk_buff *skb,
- struct ubuf_info *ubuf, bool success)
+ struct ubuf_info *ubuf_base, bool success)
{
+ struct ubuf_info_msgzc *ubuf = uarg_to_msgzc(ubuf_base);
struct vhost_net_ubuf_ref *ubufs = ubuf->ctx;
struct vhost_virtqueue *vq = ubufs->vq;
int cnt;
@@ -871,7 +872,7 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
size_t len, total_len = 0;
int err;
struct vhost_net_ubuf_ref *ubufs;
- struct ubuf_info *ubuf;
+ struct ubuf_info_msgzc *ubuf;
bool zcopy_used;
int sent_pkts = 0;

@@ -907,14 +908,14 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
ubuf = nvq->ubuf_info + nvq->upend_idx;
vq->heads[nvq->upend_idx].id = cpu_to_vhost32(vq, head);
vq->heads[nvq->upend_idx].len = VHOST_DMA_IN_PROGRESS;
- ubuf->callback = vhost_zerocopy_callback;
ubuf->ctx = nvq->ubufs;
ubuf->desc = nvq->upend_idx;
- ubuf->flags = SKBFL_ZEROCOPY_FRAG;
- refcount_set(&ubuf->refcnt, 1);
+ ubuf->ubuf.callback = vhost_zerocopy_callback;
+ ubuf->ubuf.flags = SKBFL_ZEROCOPY_FRAG;
+ refcount_set(&ubuf->ubuf.refcnt, 1);
msg.msg_control = &ctl;
ctl.type = TUN_MSG_UBUF;
- ctl.ptr = ubuf;
+ ctl.ptr = &ubuf->ubuf;
msg.msg_controllen = sizeof(ctl);
ubufs = nvq->ubufs;
atomic_inc(&ubufs->refcount);
--
2.37.2

2022-09-27 14:14:40

by Paolo Abeni

[permalink] [raw]
Subject: Re: [PATCH net-next 0/4] shrink struct ubuf_info

Hello,

On Fri, 2022-09-23 at 17:39 +0100, Pavel Begunkov wrote:
> struct ubuf_info is large but not all fields are needed for all
> cases. We have limited space in io_uring for it and large ubuf_info
> prevents some struct embedding, even though we use only a subset
> of the fields. It's also not very clean trying to use this typeless
> extra space.
>
> Shrink struct ubuf_info to only necessary fields used in generic paths,
> namely ->callback, ->refcnt and ->flags, which take only 16 bytes. And
> make MSG_ZEROCOPY and some other users to embed it into a larger struct
> ubuf_info_msgzc mimicking the former ubuf_info.
>
> Note, xen/vhost may also have some cleaning on top by creating
> new structs containing ubuf_info but with proper types.

That sounds a bit scaring to me. If I read correctly, every uarg user
should check 'uarg->callback == msg_zerocopy_callback' before accessing
any 'extend' fields. AFAICS the current code sometimes don't do the
explicit test because the condition is somewhat implied, which in turn
is quite hard to track. 

clearing uarg->zerocopy for the 'wrong' uarg was armless and undetected
before this series, and after will trigger an oops..

There is some noise due to uarg -> uarg_zc renaming which make the
series harder to review. Have you considered instead keeping the old
name and introducing a smaller 'struct ubuf_info_common'? the overall
code should be mostly the same, but it will avoid the above mentioned
noise.

Thanks!

Paolo

2022-09-27 14:53:42

by Pavel Begunkov

[permalink] [raw]
Subject: Re: [PATCH net-next 0/4] shrink struct ubuf_info

Hello Paolo,

On 9/27/22 14:49, Paolo Abeni wrote:
> Hello,
>
> On Fri, 2022-09-23 at 17:39 +0100, Pavel Begunkov wrote:
>> struct ubuf_info is large but not all fields are needed for all
>> cases. We have limited space in io_uring for it and large ubuf_info
>> prevents some struct embedding, even though we use only a subset
>> of the fields. It's also not very clean trying to use this typeless
>> extra space.
>>
>> Shrink struct ubuf_info to only necessary fields used in generic paths,
>> namely ->callback, ->refcnt and ->flags, which take only 16 bytes. And
>> make MSG_ZEROCOPY and some other users to embed it into a larger struct
>> ubuf_info_msgzc mimicking the former ubuf_info.
>>
>> Note, xen/vhost may also have some cleaning on top by creating
>> new structs containing ubuf_info but with proper types.
>
> That sounds a bit scaring to me. If I read correctly, every uarg user
> should check 'uarg->callback == msg_zerocopy_callback' before accessing
> any 'extend' fields.

Providers of ubuf_info access those fields via callbacks and so already
know the actual structure used. The net core, on the opposite, should
keep it encapsulated and not touch them at all.

The series lists all places where we use extended fields just on the
merit of stripping the structure of those fields and successfully
building it. The only user in net/ipv{4,6}/* is MSG_ZEROCOPY, which
again uses callbacks.

Sounds like the right direction for me. There is a couple of
places where it might get type safer, i.e. adding types instead
of void* in for struct tun_msg_ctl and getting rid of one macro
hiding types in xen. But seems more like TODO for later.

> AFAICS the current code sometimes don't do the
> explicit test because the condition is somewhat implied, which in turn
> is quite hard to track.
>
> clearing uarg->zerocopy for the 'wrong' uarg was armless and undetected
> before this series, and after will trigger an oops..

And now we don't have this field at all to access, considering that
nobody blindly casts it.

> There is some noise due to uarg -> uarg_zc renaming which make the
> series harder to review. Have you considered instead keeping the old
> name and introducing a smaller 'struct ubuf_info_common'? the overall
> code should be mostly the same, but it will avoid the above mentioned
> noise.

I don't think there will be less noise this way, but let me try
and see if I can get rid of some churn.

--
Pavel Begunkov

2022-09-27 17:33:43

by Pavel Begunkov

[permalink] [raw]
Subject: Re: [PATCH net-next 0/4] shrink struct ubuf_info

On 9/27/22 15:28, Pavel Begunkov wrote:
> Hello Paolo,
>
> On 9/27/22 14:49, Paolo Abeni wrote:
>> Hello,
>>
>> On Fri, 2022-09-23 at 17:39 +0100, Pavel Begunkov wrote:
>>> struct ubuf_info is large but not all fields are needed for all
>>> cases. We have limited space in io_uring for it and large ubuf_info
>>> prevents some struct embedding, even though we use only a subset
>>> of the fields. It's also not very clean trying to use this typeless
>>> extra space.
>>>
>>> Shrink struct ubuf_info to only necessary fields used in generic paths,
>>> namely ->callback, ->refcnt and ->flags, which take only 16 bytes. And
>>> make MSG_ZEROCOPY and some other users to embed it into a larger struct
>>> ubuf_info_msgzc mimicking the former ubuf_info.
>>>
>>> Note, xen/vhost may also have some cleaning on top by creating
>>> new structs containing ubuf_info but with proper types.
>>
>> That sounds a bit scaring to me. If I read correctly, every uarg user
>> should check 'uarg->callback == msg_zerocopy_callback' before accessing
>> any 'extend' fields.
>
> Providers of ubuf_info access those fields via callbacks and so already
> know the actual structure used. The net core, on the opposite, should
> keep it encapsulated and not touch them at all.
>
> The series lists all places where we use extended fields just on the
> merit of stripping the structure of those fields and successfully
> building it. The only user in net/ipv{4,6}/* is MSG_ZEROCOPY, which
> again uses callbacks.
>
> Sounds like the right direction for me. There is a couple of
> places where it might get type safer, i.e. adding types instead
> of void* in for struct tun_msg_ctl and getting rid of one macro
> hiding types in xen. But seems more like TODO for later.
>
>> AFAICS the current code sometimes don't do the
>> explicit test because the condition is somewhat implied, which in turn
>> is quite hard to track.
>>
>> clearing uarg->zerocopy for the 'wrong' uarg was armless and undetected
>> before this series, and after will trigger an oops..
>
> And now we don't have this field at all to access, considering that
> nobody blindly casts it.
>
>> There is some noise due to uarg -> uarg_zc renaming which make the
>> series harder to review. Have you considered instead keeping the old
>> name and introducing a smaller 'struct ubuf_info_common'? the overall
>> code should be mostly the same, but it will avoid the above mentioned
>> noise.
>
> I don't think there will be less noise this way, but let me try
> and see if I can get rid of some churn.

It doesn't look any better for me

TL;DR; This series converts only 3 users: tap, xen and MSG_ZEROCOPY
and doesn't touch core code. If we do ubuf_info_common though I'd need
to convert lots of places in skbuff.c and multiple places across
tcp/udp, which is much worse. And then I'd still need to touch all
users to do ubuf_info -> ubuf_info_common conversion and all in a
single commit to not break build.

If it's about naming, I can add a tree-wide renaming patch on top.

Paolo, I'd appreciate if you let know whether you're fine with it
or not, I don't want the series to get stuck. For bug concerns,
all places touching those optional fields are converted to
ubuf_info_msgzc, and I wouldn't say 4/4 is so bad.

--
Pavel Begunkov

2022-09-27 18:17:37

by Paolo Abeni

[permalink] [raw]
Subject: Re: [PATCH net-next 0/4] shrink struct ubuf_info

On Tue, 2022-09-27 at 18:16 +0100, Pavel Begunkov wrote:
> On 9/27/22 15:28, Pavel Begunkov wrote:
> > Hello Paolo,
> >
> > On 9/27/22 14:49, Paolo Abeni wrote:
> > > Hello,
> > >
> > > On Fri, 2022-09-23 at 17:39 +0100, Pavel Begunkov wrote:
> > > > struct ubuf_info is large but not all fields are needed for all
> > > > cases. We have limited space in io_uring for it and large ubuf_info
> > > > prevents some struct embedding, even though we use only a subset
> > > > of the fields. It's also not very clean trying to use this typeless
> > > > extra space.
> > > >
> > > > Shrink struct ubuf_info to only necessary fields used in generic paths,
> > > > namely ->callback, ->refcnt and ->flags, which take only 16 bytes. And
> > > > make MSG_ZEROCOPY and some other users to embed it into a larger struct
> > > > ubuf_info_msgzc mimicking the former ubuf_info.
> > > >
> > > > Note, xen/vhost may also have some cleaning on top by creating
> > > > new structs containing ubuf_info but with proper types.
> > >
> > > That sounds a bit scaring to me. If I read correctly, every uarg user
> > > should check 'uarg->callback == msg_zerocopy_callback' before accessing
> > > any 'extend' fields.
> >
> > Providers of ubuf_info access those fields via callbacks and so already
> > know the actual structure used. The net core, on the opposite, should
> > keep it encapsulated and not touch them at all.
> >
> > The series lists all places where we use extended fields just on the
> > merit of stripping the structure of those fields and successfully
> > building it. The only user in net/ipv{4,6}/* is MSG_ZEROCOPY, which
> > again uses callbacks.
> >
> > Sounds like the right direction for me. There is a couple of
> > places where it might get type safer, i.e. adding types instead
> > of void* in for struct tun_msg_ctl and getting rid of one macro
> > hiding types in xen. But seems more like TODO for later.
> >
> > > AFAICS the current code sometimes don't do the
> > > explicit test because the condition is somewhat implied, which in turn
> > > is quite hard to track.
> > >
> > > clearing uarg->zerocopy for the 'wrong' uarg was armless and undetected
> > > before this series, and after will trigger an oops..
> >
> > And now we don't have this field at all to access, considering that
> > nobody blindly casts it.
> >
> > > There is some noise due to uarg -> uarg_zc renaming which make the
> > > series harder to review. Have you considered instead keeping the old
> > > name and introducing a smaller 'struct ubuf_info_common'? the overall
> > > code should be mostly the same, but it will avoid the above mentioned
> > > noise.
> >
> > I don't think there will be less noise this way, but let me try
> > and see if I can get rid of some churn.
>
> It doesn't look any better for me
>
> TL;DR; This series converts only 3 users: tap, xen and MSG_ZEROCOPY
> and doesn't touch core code. If we do ubuf_info_common though I'd need
> to convert lots of places in skbuff.c and multiple places across
> tcp/udp, which is much worse. 

Uhmm... I underlook the fact we must preserve the current accessors for
the common fields.

I guess something like the following could do (completely untested,
hopefully should illustrate the idea):

struct ubuf_info {
struct_group_tagged(ubuf_info_common, common,
void (*callback)(struct sk_buff *, struct ubuf_info *,
bool zerocopy_success);
refcount_t refcnt;
u8 flags;
);

union {
struct {
unsigned long desc;
void *ctx;
};
struct {
u32 id;
u16 len;
u16 zerocopy:1;
u32 bytelen;
};
};

struct mmpin {
struct user_struct *user;
unsigned int num_pg;
} mmp;
};

Then you should be able to:
- access ubuf_info->callback, 
- access the same field via ubuf_info->common.callback
- declare variables as 'struct ubuf_info_commom' with appropriate
contents.

WDYT?

Thanks,

Paolo


2022-09-27 19:17:03

by Pavel Begunkov

[permalink] [raw]
Subject: Re: [PATCH net-next 0/4] shrink struct ubuf_info

On 9/27/22 18:56, Paolo Abeni wrote:
> On Tue, 2022-09-27 at 18:16 +0100, Pavel Begunkov wrote:
>> On 9/27/22 15:28, Pavel Begunkov wrote:
>>> Hello Paolo,
>>>
>>> On 9/27/22 14:49, Paolo Abeni wrote:
>>>> Hello,
>>>>
>>>> On Fri, 2022-09-23 at 17:39 +0100, Pavel Begunkov wrote:
>>>>> struct ubuf_info is large but not all fields are needed for all
>>>>> cases. We have limited space in io_uring for it and large ubuf_info
>>>>> prevents some struct embedding, even though we use only a subset
>>>>> of the fields. It's also not very clean trying to use this typeless
>>>>> extra space.
>>>>>
>>>>> Shrink struct ubuf_info to only necessary fields used in generic paths,
>>>>> namely ->callback, ->refcnt and ->flags, which take only 16 bytes. And
>>>>> make MSG_ZEROCOPY and some other users to embed it into a larger struct
>>>>> ubuf_info_msgzc mimicking the former ubuf_info.
>>>>>
>>>>> Note, xen/vhost may also have some cleaning on top by creating
>>>>> new structs containing ubuf_info but with proper types.
>>>>
>>>> That sounds a bit scaring to me. If I read correctly, every uarg user
>>>> should check 'uarg->callback == msg_zerocopy_callback' before accessing
>>>> any 'extend' fields.
>>>
>>> Providers of ubuf_info access those fields via callbacks and so already
>>> know the actual structure used. The net core, on the opposite, should
>>> keep it encapsulated and not touch them at all.
>>>
>>> The series lists all places where we use extended fields just on the
>>> merit of stripping the structure of those fields and successfully
>>> building it. The only user in net/ipv{4,6}/* is MSG_ZEROCOPY, which
>>> again uses callbacks.
>>>
>>> Sounds like the right direction for me. There is a couple of
>>> places where it might get type safer, i.e. adding types instead
>>> of void* in for struct tun_msg_ctl and getting rid of one macro
>>> hiding types in xen. But seems more like TODO for later.
>>>
>>>> AFAICS the current code sometimes don't do the
>>>> explicit test because the condition is somewhat implied, which in turn
>>>> is quite hard to track.
>>>>
>>>> clearing uarg->zerocopy for the 'wrong' uarg was armless and undetected
>>>> before this series, and after will trigger an oops..
>>>
>>> And now we don't have this field at all to access, considering that
>>> nobody blindly casts it.
>>>
>>>> There is some noise due to uarg -> uarg_zc renaming which make the
>>>> series harder to review. Have you considered instead keeping the old
>>>> name and introducing a smaller 'struct ubuf_info_common'? the overall
>>>> code should be mostly the same, but it will avoid the above mentioned
>>>> noise.
>>>
>>> I don't think there will be less noise this way, but let me try
>>> and see if I can get rid of some churn.
>>
>> It doesn't look any better for me
>>
>> TL;DR; This series converts only 3 users: tap, xen and MSG_ZEROCOPY
>> and doesn't touch core code. If we do ubuf_info_common though I'd need
>> to convert lots of places in skbuff.c and multiple places across
>> tcp/udp, which is much worse.
>
> Uhmm... I underlook the fact we must preserve the current accessors for
> the common fields.
>
> I guess something like the following could do (completely untested,
> hopefully should illustrate the idea):
>
> struct ubuf_info {
> struct_group_tagged(ubuf_info_common, common,
> void (*callback)(struct sk_buff *, struct ubuf_info *,
> bool zerocopy_success);
> refcount_t refcnt;
> u8 flags;
> );
>
> union {
> struct {
> unsigned long desc;
> void *ctx;
> };
> struct {
> u32 id;
> u16 len;
> u16 zerocopy:1;
> u32 bytelen;
> };
> };
>
> struct mmpin {
> struct user_struct *user;
> unsigned int num_pg;
> } mmp;
> };
>
> Then you should be able to:
> - access ubuf_info->callback,
> - access the same field via ubuf_info->common.callback
> - declare variables as 'struct ubuf_info_commom' with appropriate
> contents.
>
> WDYT?

Interesting, I didn't think about struct_group, this would
let to split patches better and would limit non-core changes.
But if the plan is to convert the core helpers to
ubuf_info_common, than I think it's still messier than changing
ubuf providers only.

I can do the exercise, but I don't really see what is the goal.
Let me ask this, if we forget for a second how diffs look,
do you care about which pair is going to be in the end?
ubuf_info_common/ubuf_info vs ubuf_info/ubuf_info_msgzc?
Are there you concerned about naming or is there more to it?

--
Pavel Begunkov

2022-09-27 20:47:26

by Paolo Abeni

[permalink] [raw]
Subject: Re: [PATCH net-next 0/4] shrink struct ubuf_info

On Tue, 2022-09-27 at 19:48 +0100, Pavel Begunkov wrote:
> On 9/27/22 18:56, Paolo Abeni wrote:
> > On Tue, 2022-09-27 at 18:16 +0100, Pavel Begunkov wrote:
> > > On 9/27/22 15:28, Pavel Begunkov wrote:
> > > > Hello Paolo,
> > > >
> > > > On 9/27/22 14:49, Paolo Abeni wrote:
> > > > > Hello,
> > > > >
> > > > > On Fri, 2022-09-23 at 17:39 +0100, Pavel Begunkov wrote:
> > > > > > struct ubuf_info is large but not all fields are needed for all
> > > > > > cases. We have limited space in io_uring for it and large ubuf_info
> > > > > > prevents some struct embedding, even though we use only a subset
> > > > > > of the fields. It's also not very clean trying to use this typeless
> > > > > > extra space.
> > > > > >
> > > > > > Shrink struct ubuf_info to only necessary fields used in generic paths,
> > > > > > namely ->callback, ->refcnt and ->flags, which take only 16 bytes. And
> > > > > > make MSG_ZEROCOPY and some other users to embed it into a larger struct
> > > > > > ubuf_info_msgzc mimicking the former ubuf_info.
> > > > > >
> > > > > > Note, xen/vhost may also have some cleaning on top by creating
> > > > > > new structs containing ubuf_info but with proper types.
> > > > >
> > > > > That sounds a bit scaring to me. If I read correctly, every uarg user
> > > > > should check 'uarg->callback == msg_zerocopy_callback' before accessing
> > > > > any 'extend' fields.
> > > >
> > > > Providers of ubuf_info access those fields via callbacks and so already
> > > > know the actual structure used. The net core, on the opposite, should
> > > > keep it encapsulated and not touch them at all.
> > > >
> > > > The series lists all places where we use extended fields just on the
> > > > merit of stripping the structure of those fields and successfully
> > > > building it. The only user in net/ipv{4,6}/* is MSG_ZEROCOPY, which
> > > > again uses callbacks.
> > > >
> > > > Sounds like the right direction for me. There is a couple of
> > > > places where it might get type safer, i.e. adding types instead
> > > > of void* in for struct tun_msg_ctl and getting rid of one macro
> > > > hiding types in xen. But seems more like TODO for later.
> > > >
> > > > > AFAICS the current code sometimes don't do the
> > > > > explicit test because the condition is somewhat implied, which in turn
> > > > > is quite hard to track.
> > > > >
> > > > > clearing uarg->zerocopy for the 'wrong' uarg was armless and undetected
> > > > > before this series, and after will trigger an oops..
> > > >
> > > > And now we don't have this field at all to access, considering that
> > > > nobody blindly casts it.
> > > >
> > > > > There is some noise due to uarg -> uarg_zc renaming which make the
> > > > > series harder to review. Have you considered instead keeping the old
> > > > > name and introducing a smaller 'struct ubuf_info_common'? the overall
> > > > > code should be mostly the same, but it will avoid the above mentioned
> > > > > noise.
> > > >
> > > > I don't think there will be less noise this way, but let me try
> > > > and see if I can get rid of some churn.
> > >
> > > It doesn't look any better for me
> > >
> > > TL;DR; This series converts only 3 users: tap, xen and MSG_ZEROCOPY
> > > and doesn't touch core code. If we do ubuf_info_common though I'd need
> > > to convert lots of places in skbuff.c and multiple places across
> > > tcp/udp, which is much worse.
> >
> > Uhmm... I underlook the fact we must preserve the current accessors for
> > the common fields.
> >
> > I guess something like the following could do (completely untested,
> > hopefully should illustrate the idea):
> >
> > struct ubuf_info {
> > struct_group_tagged(ubuf_info_common, common,
> > void (*callback)(struct sk_buff *, struct ubuf_info *,
> > bool zerocopy_success);
> > refcount_t refcnt;
> > u8 flags;
> > );
> >
> > union {
> > struct {
> > unsigned long desc;
> > void *ctx;
> > };
> > struct {
> > u32 id;
> > u16 len;
> > u16 zerocopy:1;
> > u32 bytelen;
> > };
> > };
> >
> > struct mmpin {
> > struct user_struct *user;
> > unsigned int num_pg;
> > } mmp;
> > };
> >
> > Then you should be able to:
> > - access ubuf_info->callback,
> > - access the same field via ubuf_info->common.callback
> > - declare variables as 'struct ubuf_info_commom' with appropriate
> > contents.
> >
> > WDYT?
>
> Interesting, I didn't think about struct_group, this would
> let to split patches better and would limit non-core changes.
> But if the plan is to convert the core helpers to
> ubuf_info_common, than I think it's still messier than changing
> ubuf providers only.
>
> I can do the exercise, but I don't really see what is the goal.
> Let me ask this, if we forget for a second how diffs look,
> do you care about which pair is going to be in the end?

Uhm... I proposed this initially with the goal of remove non fuctional
changes from a patch that was hard to digest for me (4/4). So it's
about diffstat to me ;)

On the flip side the change suggested would probably not be as
straighforward as I would hope for.

> ubuf_info_common/ubuf_info vs ubuf_info/ubuf_info_msgzc?

The specific names used are not much relevant.

> Are there you concerned about naming or is there more to it?

I feel like this series is potentially dangerous, but I could not spot
bugs into the code. I would have felt more relaxed eariler in the devel
cycle.

Cheers,

Paolo

2022-09-27 20:48:05

by Pavel Begunkov

[permalink] [raw]
Subject: Re: [PATCH net-next 0/4] shrink struct ubuf_info

On 9/27/22 20:59, Paolo Abeni wrote:
> On Tue, 2022-09-27 at 19:48 +0100, Pavel Begunkov wrote:
>> On 9/27/22 18:56, Paolo Abeni wrote:
>>> On Tue, 2022-09-27 at 18:16 +0100, Pavel Begunkov wrote:
>>>> On 9/27/22 15:28, Pavel Begunkov wrote:
>>>>> Hello Paolo,
>>>>>
>>>>> On 9/27/22 14:49, Paolo Abeni wrote:
>>>>>> Hello,
>>>>>>
>>>>>> On Fri, 2022-09-23 at 17:39 +0100, Pavel Begunkov wrote:
>>>>>>> struct ubuf_info is large but not all fields are needed for all
>>>>>>> cases. We have limited space in io_uring for it and large ubuf_info
>>>>>>> prevents some struct embedding, even though we use only a subset
>>>>>>> of the fields. It's also not very clean trying to use this typeless
>>>>>>> extra space.
>>>>>>>
>>>>>>> Shrink struct ubuf_info to only necessary fields used in generic paths,
>>>>>>> namely ->callback, ->refcnt and ->flags, which take only 16 bytes. And
>>>>>>> make MSG_ZEROCOPY and some other users to embed it into a larger struct
>>>>>>> ubuf_info_msgzc mimicking the former ubuf_info.
>>>>>>>
>>>>>>> Note, xen/vhost may also have some cleaning on top by creating
>>>>>>> new structs containing ubuf_info but with proper types.
>>>>>>
>>>>>> That sounds a bit scaring to me. If I read correctly, every uarg user
>>>>>> should check 'uarg->callback == msg_zerocopy_callback' before accessing
>>>>>> any 'extend' fields.
>>>>>
>>>>> Providers of ubuf_info access those fields via callbacks and so already
>>>>> know the actual structure used. The net core, on the opposite, should
>>>>> keep it encapsulated and not touch them at all.
>>>>>
>>>>> The series lists all places where we use extended fields just on the
>>>>> merit of stripping the structure of those fields and successfully
>>>>> building it. The only user in net/ipv{4,6}/* is MSG_ZEROCOPY, which
>>>>> again uses callbacks.
>>>>>
>>>>> Sounds like the right direction for me. There is a couple of
>>>>> places where it might get type safer, i.e. adding types instead
>>>>> of void* in for struct tun_msg_ctl and getting rid of one macro
>>>>> hiding types in xen. But seems more like TODO for later.
>>>>>
>>>>>> AFAICS the current code sometimes don't do the
>>>>>> explicit test because the condition is somewhat implied, which in turn
>>>>>> is quite hard to track.
>>>>>>
>>>>>> clearing uarg->zerocopy for the 'wrong' uarg was armless and undetected
>>>>>> before this series, and after will trigger an oops..
>>>>>
>>>>> And now we don't have this field at all to access, considering that
>>>>> nobody blindly casts it.
>>>>>
>>>>>> There is some noise due to uarg -> uarg_zc renaming which make the
>>>>>> series harder to review. Have you considered instead keeping the old
>>>>>> name and introducing a smaller 'struct ubuf_info_common'? the overall
>>>>>> code should be mostly the same, but it will avoid the above mentioned
>>>>>> noise.
>>>>>
>>>>> I don't think there will be less noise this way, but let me try
>>>>> and see if I can get rid of some churn.
>>>>
>>>> It doesn't look any better for me
>>>>
>>>> TL;DR; This series converts only 3 users: tap, xen and MSG_ZEROCOPY
>>>> and doesn't touch core code. If we do ubuf_info_common though I'd need
>>>> to convert lots of places in skbuff.c and multiple places across
>>>> tcp/udp, which is much worse.
>>>
>>> Uhmm... I underlook the fact we must preserve the current accessors for
>>> the common fields.
>>>
>>> I guess something like the following could do (completely untested,
>>> hopefully should illustrate the idea):
>>>
>>> struct ubuf_info {
>>> struct_group_tagged(ubuf_info_common, common,
>>> void (*callback)(struct sk_buff *, struct ubuf_info *,
>>> bool zerocopy_success);
>>> refcount_t refcnt;
>>> u8 flags;
>>> );
>>>
>>> union {
>>> struct {
>>> unsigned long desc;
>>> void *ctx;
>>> };
>>> struct {
>>> u32 id;
>>> u16 len;
>>> u16 zerocopy:1;
>>> u32 bytelen;
>>> };
>>> };
>>>
>>> struct mmpin {
>>> struct user_struct *user;
>>> unsigned int num_pg;
>>> } mmp;
>>> };
>>>
>>> Then you should be able to:
>>> - access ubuf_info->callback,
>>> - access the same field via ubuf_info->common.callback
>>> - declare variables as 'struct ubuf_info_commom' with appropriate
>>> contents.
>>>
>>> WDYT?
>>
>> Interesting, I didn't think about struct_group, this would
>> let to split patches better and would limit non-core changes.
>> But if the plan is to convert the core helpers to
>> ubuf_info_common, than I think it's still messier than changing
>> ubuf providers only.
>>
>> I can do the exercise, but I don't really see what is the goal.
>> Let me ask this, if we forget for a second how diffs look,
>> do you care about which pair is going to be in the end?
>
> Uhm... I proposed this initially with the goal of remove non fuctional
> changes from a patch that was hard to digest for me (4/4). So it's
> about diffstat to me ;)

Ah, got it

> On the flip side the change suggested would probably not be as
> straighforward as I would hope for.
>
>> ubuf_info_common/ubuf_info vs ubuf_info/ubuf_info_msgzc?
>
> The specific names used are not much relevant.
>
>> Are there you concerned about naming or is there more to it?
>
> I feel like this series is potentially dangerous, but I could not spot
> bugs into the code. I would have felt more relaxed eariler in the devel
> cycle.

union {
struct {
unsigned long desc;
void *ctx;
};
struct {
u32 id;
u16 len;
u16 zerocopy:1;
u32 bytelen;
};
};


btw, nobody would frivolously change ->zerocopy anyway as it's
in a union. Even without the series we're absolutely screwed
if someone does that. If anything it adds a way to get rid of it:

1) Make vhost and xen use their own structures with right types.
2) kill unused struct {ctx, desc} for MSG_ZEROCOPY

--
Pavel Begunkov

2022-09-27 20:49:27

by Paolo Abeni

[permalink] [raw]
Subject: Re: [PATCH net-next 0/4] shrink struct ubuf_info

On Tue, 2022-09-27 at 21:17 +0100, Pavel Begunkov wrote:
> On 9/27/22 20:59, Paolo Abeni wrote:
> > On Tue, 2022-09-27 at 19:48 +0100, Pavel Begunkov wrote:
> > > On 9/27/22 18:56, Paolo Abeni wrote:
> > > > On Tue, 2022-09-27 at 18:16 +0100, Pavel Begunkov wrote:
> > > > > On 9/27/22 15:28, Pavel Begunkov wrote:
> > > > > > Hello Paolo,
> > > > > >
> > > > > > On 9/27/22 14:49, Paolo Abeni wrote:
> > > > > > > Hello,
> > > > > > >
> > > > > > > On Fri, 2022-09-23 at 17:39 +0100, Pavel Begunkov wrote:
> > > > > > > > struct ubuf_info is large but not all fields are needed for all
> > > > > > > > cases. We have limited space in io_uring for it and large ubuf_info
> > > > > > > > prevents some struct embedding, even though we use only a subset
> > > > > > > > of the fields. It's also not very clean trying to use this typeless
> > > > > > > > extra space.
> > > > > > > >
> > > > > > > > Shrink struct ubuf_info to only necessary fields used in generic paths,
> > > > > > > > namely ->callback, ->refcnt and ->flags, which take only 16 bytes. And
> > > > > > > > make MSG_ZEROCOPY and some other users to embed it into a larger struct
> > > > > > > > ubuf_info_msgzc mimicking the former ubuf_info.
> > > > > > > >
> > > > > > > > Note, xen/vhost may also have some cleaning on top by creating
> > > > > > > > new structs containing ubuf_info but with proper types.
> > > > > > >
> > > > > > > That sounds a bit scaring to me. If I read correctly, every uarg user
> > > > > > > should check 'uarg->callback == msg_zerocopy_callback' before accessing
> > > > > > > any 'extend' fields.
> > > > > >
> > > > > > Providers of ubuf_info access those fields via callbacks and so already
> > > > > > know the actual structure used. The net core, on the opposite, should
> > > > > > keep it encapsulated and not touch them at all.
> > > > > >
> > > > > > The series lists all places where we use extended fields just on the
> > > > > > merit of stripping the structure of those fields and successfully
> > > > > > building it. The only user in net/ipv{4,6}/* is MSG_ZEROCOPY, which
> > > > > > again uses callbacks.
> > > > > >
> > > > > > Sounds like the right direction for me. There is a couple of
> > > > > > places where it might get type safer, i.e. adding types instead
> > > > > > of void* in for struct tun_msg_ctl and getting rid of one macro
> > > > > > hiding types in xen. But seems more like TODO for later.
> > > > > >
> > > > > > > AFAICS the current code sometimes don't do the
> > > > > > > explicit test because the condition is somewhat implied, which in turn
> > > > > > > is quite hard to track.
> > > > > > >
> > > > > > > clearing uarg->zerocopy for the 'wrong' uarg was armless and undetected
> > > > > > > before this series, and after will trigger an oops..
> > > > > >
> > > > > > And now we don't have this field at all to access, considering that
> > > > > > nobody blindly casts it.
> > > > > >
> > > > > > > There is some noise due to uarg -> uarg_zc renaming which make the
> > > > > > > series harder to review. Have you considered instead keeping the old
> > > > > > > name and introducing a smaller 'struct ubuf_info_common'? the overall
> > > > > > > code should be mostly the same, but it will avoid the above mentioned
> > > > > > > noise.
> > > > > >
> > > > > > I don't think there will be less noise this way, but let me try
> > > > > > and see if I can get rid of some churn.
> > > > >
> > > > > It doesn't look any better for me
> > > > >
> > > > > TL;DR; This series converts only 3 users: tap, xen and MSG_ZEROCOPY
> > > > > and doesn't touch core code. If we do ubuf_info_common though I'd need
> > > > > to convert lots of places in skbuff.c and multiple places across
> > > > > tcp/udp, which is much worse.
> > > >
> > > > Uhmm... I underlook the fact we must preserve the current accessors for
> > > > the common fields.
> > > >
> > > > I guess something like the following could do (completely untested,
> > > > hopefully should illustrate the idea):
> > > >
> > > > struct ubuf_info {
> > > > struct_group_tagged(ubuf_info_common, common,
> > > > void (*callback)(struct sk_buff *, struct ubuf_info *,
> > > > bool zerocopy_success);
> > > > refcount_t refcnt;
> > > > u8 flags;
> > > > );
> > > >
> > > > union {
> > > > struct {
> > > > unsigned long desc;
> > > > void *ctx;
> > > > };
> > > > struct {
> > > > u32 id;
> > > > u16 len;
> > > > u16 zerocopy:1;
> > > > u32 bytelen;
> > > > };
> > > > };
> > > >
> > > > struct mmpin {
> > > > struct user_struct *user;
> > > > unsigned int num_pg;
> > > > } mmp;
> > > > };
> > > >
> > > > Then you should be able to:
> > > > - access ubuf_info->callback,
> > > > - access the same field via ubuf_info->common.callback
> > > > - declare variables as 'struct ubuf_info_commom' with appropriate
> > > > contents.
> > > >
> > > > WDYT?
> > >
> > > Interesting, I didn't think about struct_group, this would
> > > let to split patches better and would limit non-core changes.
> > > But if the plan is to convert the core helpers to
> > > ubuf_info_common, than I think it's still messier than changing
> > > ubuf providers only.
> > >
> > > I can do the exercise, but I don't really see what is the goal.
> > > Let me ask this, if we forget for a second how diffs look,
> > > do you care about which pair is going to be in the end?
> >
> > Uhm... I proposed this initially with the goal of remove non fuctional
> > changes from a patch that was hard to digest for me (4/4). So it's
> > about diffstat to me ;)
>
> Ah, got it
>
> > On the flip side the change suggested would probably not be as
> > straighforward as I would hope for.
> >
> > > ubuf_info_common/ubuf_info vs ubuf_info/ubuf_info_msgzc?
> >
> > The specific names used are not much relevant.
> >
> > > Are there you concerned about naming or is there more to it?
> >
> > I feel like this series is potentially dangerous, but I could not spot
> > bugs into the code. I would have felt more relaxed eariler in the devel
> > cycle.
>
> union {
> struct {
> unsigned long desc;
> void *ctx;
> };
> struct {
> u32 id;
> u16 len;
> u16 zerocopy:1;
> u32 bytelen;
> };
> };
>
>
> btw, nobody would frivolously change ->zerocopy anyway as it's
> in a union. Even without the series we're absolutely screwed
> if someone does that. If anything it adds a way to get rid of it:
>
> 1) Make vhost and xen use their own structures with right types.
> 2) kill unused struct {ctx, desc} for MSG_ZEROCOPY

Ok, the above sounds reasonable. Additionally I've spent the last
surviving neuron on my side to on this series, and it looks sane, so...

Acked-by: Paolo Abeni <[email protected]>

2022-09-27 21:25:32

by Pavel Begunkov

[permalink] [raw]
Subject: Re: [PATCH net-next 0/4] shrink struct ubuf_info

On 9/27/22 21:23, Paolo Abeni wrote:
> On Tue, 2022-09-27 at 21:17 +0100, Pavel Begunkov wrote:
>> On 9/27/22 20:59, Paolo Abeni wrote:
>>> On Tue, 2022-09-27 at 19:48 +0100, Pavel Begunkov wrote:
>>>> On 9/27/22 18:56, Paolo Abeni wrote:
>>>>> On Tue, 2022-09-27 at 18:16 +0100, Pavel Begunkov wrote:
>>>>>> On 9/27/22 15:28, Pavel Begunkov wrote:
>>>>>>> Hello Paolo,
>>>>>>>
>>>>>>> On 9/27/22 14:49, Paolo Abeni wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> On Fri, 2022-09-23 at 17:39 +0100, Pavel Begunkov wrote:
>>>>>>>>> struct ubuf_info is large but not all fields are needed for all
>>>>>>>>> cases. We have limited space in io_uring for it and large ubuf_info
>>>>>>>>> prevents some struct embedding, even though we use only a subset
>>>>>>>>> of the fields. It's also not very clean trying to use this typeless
>>>>>>>>> extra space.
>>>>>>>>>
>>>>>>>>> Shrink struct ubuf_info to only necessary fields used in generic paths,
>>>>>>>>> namely ->callback, ->refcnt and ->flags, which take only 16 bytes. And
>>>>>>>>> make MSG_ZEROCOPY and some other users to embed it into a larger struct
>>>>>>>>> ubuf_info_msgzc mimicking the former ubuf_info.
>>>>>>>>>
>>>>>>>>> Note, xen/vhost may also have some cleaning on top by creating
>>>>>>>>> new structs containing ubuf_info but with proper types.
>>>>>>>>
>>>>>>>> That sounds a bit scaring to me. If I read correctly, every uarg user
>>>>>>>> should check 'uarg->callback == msg_zerocopy_callback' before accessing
>>>>>>>> any 'extend' fields.
>>>>>>>
>>>>>>> Providers of ubuf_info access those fields via callbacks and so already
>>>>>>> know the actual structure used. The net core, on the opposite, should
>>>>>>> keep it encapsulated and not touch them at all.
>>>>>>>
>>>>>>> The series lists all places where we use extended fields just on the
>>>>>>> merit of stripping the structure of those fields and successfully
>>>>>>> building it. The only user in net/ipv{4,6}/* is MSG_ZEROCOPY, which
>>>>>>> again uses callbacks.
>>>>>>>
>>>>>>> Sounds like the right direction for me. There is a couple of
>>>>>>> places where it might get type safer, i.e. adding types instead
>>>>>>> of void* in for struct tun_msg_ctl and getting rid of one macro
>>>>>>> hiding types in xen. But seems more like TODO for later.
>>>>>>>
>>>>>>>> AFAICS the current code sometimes don't do the
>>>>>>>> explicit test because the condition is somewhat implied, which in turn
>>>>>>>> is quite hard to track.
>>>>>>>>
>>>>>>>> clearing uarg->zerocopy for the 'wrong' uarg was armless and undetected
>>>>>>>> before this series, and after will trigger an oops..
>>>>>>>
>>>>>>> And now we don't have this field at all to access, considering that
>>>>>>> nobody blindly casts it.
>>>>>>>
>>>>>>>> There is some noise due to uarg -> uarg_zc renaming which make the
>>>>>>>> series harder to review. Have you considered instead keeping the old
>>>>>>>> name and introducing a smaller 'struct ubuf_info_common'? the overall
>>>>>>>> code should be mostly the same, but it will avoid the above mentioned
>>>>>>>> noise.
>>>>>>>
>>>>>>> I don't think there will be less noise this way, but let me try
>>>>>>> and see if I can get rid of some churn.
>>>>>>
>>>>>> It doesn't look any better for me
>>>>>>
>>>>>> TL;DR; This series converts only 3 users: tap, xen and MSG_ZEROCOPY
>>>>>> and doesn't touch core code. If we do ubuf_info_common though I'd need
>>>>>> to convert lots of places in skbuff.c and multiple places across
>>>>>> tcp/udp, which is much worse.
>>>>>
>>>>> Uhmm... I underlook the fact we must preserve the current accessors for
>>>>> the common fields.
>>>>>
>>>>> I guess something like the following could do (completely untested,
>>>>> hopefully should illustrate the idea):
>>>>>
>>>>> struct ubuf_info {
>>>>> struct_group_tagged(ubuf_info_common, common,
>>>>> void (*callback)(struct sk_buff *, struct ubuf_info *,
>>>>> bool zerocopy_success);
>>>>> refcount_t refcnt;
>>>>> u8 flags;
>>>>> );
>>>>>
>>>>> union {
>>>>> struct {
>>>>> unsigned long desc;
>>>>> void *ctx;
>>>>> };
>>>>> struct {
>>>>> u32 id;
>>>>> u16 len;
>>>>> u16 zerocopy:1;
>>>>> u32 bytelen;
>>>>> };
>>>>> };
>>>>>
>>>>> struct mmpin {
>>>>> struct user_struct *user;
>>>>> unsigned int num_pg;
>>>>> } mmp;
>>>>> };
>>>>>
>>>>> Then you should be able to:
>>>>> - access ubuf_info->callback,
>>>>> - access the same field via ubuf_info->common.callback
>>>>> - declare variables as 'struct ubuf_info_commom' with appropriate
>>>>> contents.
>>>>>
>>>>> WDYT?
>>>>
>>>> Interesting, I didn't think about struct_group, this would
>>>> let to split patches better and would limit non-core changes.
>>>> But if the plan is to convert the core helpers to
>>>> ubuf_info_common, than I think it's still messier than changing
>>>> ubuf providers only.
>>>>
>>>> I can do the exercise, but I don't really see what is the goal.
>>>> Let me ask this, if we forget for a second how diffs look,
>>>> do you care about which pair is going to be in the end?
>>>
>>> Uhm... I proposed this initially with the goal of remove non fuctional
>>> changes from a patch that was hard to digest for me (4/4). So it's
>>> about diffstat to me ;)
>>
>> Ah, got it
>>
>>> On the flip side the change suggested would probably not be as
>>> straighforward as I would hope for.
>>>
>>>> ubuf_info_common/ubuf_info vs ubuf_info/ubuf_info_msgzc?
>>>
>>> The specific names used are not much relevant.
>>>
>>>> Are there you concerned about naming or is there more to it?
>>>
>>> I feel like this series is potentially dangerous, but I could not spot
>>> bugs into the code. I would have felt more relaxed eariler in the devel
>>> cycle.
>>
>> union {
>> struct {
>> unsigned long desc;
>> void *ctx;
>> };
>> struct {
>> u32 id;
>> u16 len;
>> u16 zerocopy:1;
>> u32 bytelen;
>> };
>> };
>>
>>
>> btw, nobody would frivolously change ->zerocopy anyway as it's
>> in a union. Even without the series we're absolutely screwed
>> if someone does that. If anything it adds a way to get rid of it:
>>
>> 1) Make vhost and xen use their own structures with right types.
>> 2) kill unused struct {ctx, desc} for MSG_ZEROCOPY
>
> Ok, the above sounds reasonable. Additionally I've spent the last
> surviving neuron on my side to on this series, and it looks sane, so...
>
> Acked-by: Paolo Abeni <[email protected]>

Great, thanks for taking a look!

--
Pavel Begunkov

2022-09-29 02:57:40

by patchwork-bot+netdevbpf

[permalink] [raw]
Subject: Re: [PATCH net-next 0/4] shrink struct ubuf_info

Hello:

This series was applied to netdev/net-next.git (master)
by Jakub Kicinski <[email protected]>:

On Fri, 23 Sep 2022 17:39:00 +0100 you wrote:
> struct ubuf_info is large but not all fields are needed for all
> cases. We have limited space in io_uring for it and large ubuf_info
> prevents some struct embedding, even though we use only a subset
> of the fields. It's also not very clean trying to use this typeless
> extra space.
>
> Shrink struct ubuf_info to only necessary fields used in generic paths,
> namely ->callback, ->refcnt and ->flags, which take only 16 bytes. And
> make MSG_ZEROCOPY and some other users to embed it into a larger struct
> ubuf_info_msgzc mimicking the former ubuf_info.
>
> [...]

Here is the summary with links:
- [net-next,1/4] net: introduce struct ubuf_info_msgzc
https://git.kernel.org/netdev/net-next/c/6eaab4dfdd30
- [net-next,2/4] xen/netback: use struct ubuf_info_msgzc
https://git.kernel.org/netdev/net-next/c/b63ca3e822e7
- [net-next,3/4] vhost/net: use struct ubuf_info_msgzc
https://git.kernel.org/netdev/net-next/c/dfff202be5ea
- [net-next,4/4] net: shrink struct ubuf_info
https://git.kernel.org/netdev/net-next/c/e7d2b510165f

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html