2023-12-20 21:48:43

by Mina Almasry

[permalink] [raw]
Subject: [PATCH net-next v3 0/3] Abstract page from net stack

Changes in v3:

- Replaced the struct netmem union with an opaque netmem_ref type.
- Added func docs to the netmem helpers and type.
- Renamed the skb_frag_t fields since it's no longer a bio_vec

-----------

Changes in v2:
- Reverted changes to the page_pool. The page pool now retains the same
API, so that we don't have to touch many existing drivers. The devmem
TCP series will include the changes to the page pool.

- Addressed comments.

This series is a prerequisite to the devmem TCP series. For a full
snapshot of the code which includes these changes, feel free to check:

https://github.com/mina/linux/commits/tcpdevmem-rfcv5/

-----------

Currently these components in the net stack use the struct page
directly:

1. Drivers.
2. Page pool.
3. skb_frag_t.

To add support for new (non struct page) memory types to the net stack, we
must first abstract the current memory type.

Originally the plan was to reuse struct page* for the new memory types,
and to set the LSB on the page* to indicate it's not really a page.
However, for safe compiler type checking we need to introduce a new type.

struct netmem is introduced to abstract the underlying memory type.
Currently it's a no-op abstraction that is always a struct page underneath.
In parallel there is an undergoing effort to add support for devmem to the
net stack:

https://lore.kernel.org/netdev/[email protected]/

Cc: Jason Gunthorpe <[email protected]>
Cc: Christian König <[email protected]>
Cc: Shakeel Butt <[email protected]>
Cc: Yunsheng Lin <[email protected]>
Cc: Willem de Bruijn <[email protected]>

Mina Almasry (3):
vsock/virtio: use skb_frag_*() helpers
net: introduce abstraction for network memory
net: add netmem_ref to skb_frag_t

include/linux/skbuff.h | 92 ++++++++++++++++++++++----------
include/net/netmem.h | 41 ++++++++++++++
net/core/skbuff.c | 22 +++++---
net/kcm/kcmsock.c | 10 +++-
net/vmw_vsock/virtio_transport.c | 6 +--
5 files changed, 133 insertions(+), 38 deletions(-)
create mode 100644 include/net/netmem.h

--
2.43.0.472.g3155946c3a-goog



2023-12-20 22:25:02

by Mina Almasry

[permalink] [raw]
Subject: [PATCH net-next v3 2/3] net: introduce abstraction for network memory

Add the netmem_ref type, an abstraction for network memory.

To add support for new memory types to the net stack, we must first
abstract the current memory type. Currently parts of the net stack
use struct page directly:

- page_pool
- drivers
- skb_frag_t

Originally the plan was to reuse struct page* for the new memory types,
and to set the LSB on the page* to indicate it's not really a page.
However, for compiler type checking we need to introduce a new type.

netmem_ref is introduced to abstract the underlying memory type. Currently
it's a no-op abstraction that is always a struct page underneath. In
parallel there is an undergoing effort to add support for devmem to the
net stack:

https://lore.kernel.org/netdev/[email protected]/

Signed-off-by: Mina Almasry <[email protected]>

---

v3:

- Modify struct netmem from a union of struct page + new types to an opaque
netmem_ref type. I went with:

+typedef void *__bitwise netmem_ref;

rather than this that Jakub recommended:

+typedef unsigned long __bitwise netmem_ref;

Because with the latter the compiler issues warnings to cast NULL to
netmem_ref. I hope that's ok.

- Add some function docs.

v2:

- Use container_of instead of a type cast (David).
---
include/net/netmem.h | 41 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 41 insertions(+)
create mode 100644 include/net/netmem.h

diff --git a/include/net/netmem.h b/include/net/netmem.h
new file mode 100644
index 000000000000..edd977326203
--- /dev/null
+++ b/include/net/netmem.h
@@ -0,0 +1,41 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Network memory
+ *
+ * Author: Mina Almasry <[email protected]>
+ */
+
+#ifndef _NET_NETMEM_H
+#define _NET_NETMEM_H
+
+/**
+ * typedef netmem_ref - a nonexistent type marking a reference to generic
+ * network memory.
+ *
+ * A netmem_ref currently is always a reference to a struct page. This
+ * abstraction is introduced so support for new memory types can be added.
+ *
+ * Use the supplied helpers to obtain the underlying memory pointer and fields.
+ */
+typedef void *__bitwise netmem_ref;
+
+/* This conversion fails (returns NULL) if the netmem_ref is not struct page
+ * backed.
+ *
+ * Currently struct page is the only possible netmem, and this helper never
+ * fails.
+ */
+static inline struct page *netmem_to_page(netmem_ref netmem)
+{
+ return (struct page *)netmem;
+}
+
+/* Converting from page to netmem is always safe, because a page can always be
+ * a netmem.
+ */
+static inline netmem_ref page_to_netmem(struct page *page)
+{
+ return (netmem_ref)page;
+}
+
+#endif /* _NET_NETMEM_H */
--
2.43.0.472.g3155946c3a-goog


2023-12-20 22:27:35

by Mina Almasry

[permalink] [raw]
Subject: [PATCH net-next v3 1/3] vsock/virtio: use skb_frag_*() helpers

Minor fix for virtio: code wanting to access the fields inside an skb
frag should use the skb_frag_*() helpers, instead of accessing the
fields directly. This allows for extensions where the underlying
memory is not a page.

Signed-off-by: Mina Almasry <[email protected]>

---

v2:

- Also fix skb_frag_off() + skb_frag_size() (David)
- Did not apply the reviewed-by from Stefano since the patch changed
relatively much.

---
net/vmw_vsock/virtio_transport.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index f495b9e5186b..1748268e0694 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -153,10 +153,10 @@ virtio_transport_send_pkt_work(struct work_struct *work)
* 'virt_to_phys()' later to fill the buffer descriptor.
* We don't touch memory at "virtual" address of this page.
*/
- va = page_to_virt(skb_frag->bv_page);
+ va = page_to_virt(skb_frag_page(skb_frag));
sg_init_one(sgs[out_sg],
- va + skb_frag->bv_offset,
- skb_frag->bv_len);
+ va + skb_frag_off(skb_frag),
+ skb_frag_size(skb_frag));
out_sg++;
}
}
--
2.43.0.472.g3155946c3a-goog


2023-12-21 17:19:08

by Willem de Bruijn

[permalink] [raw]
Subject: Re: [PATCH net-next v3 1/3] vsock/virtio: use skb_frag_*() helpers

Mina Almasry wrote:
> Minor fix for virtio: code wanting to access the fields inside an skb
> frag should use the skb_frag_*() helpers, instead of accessing the
> fields directly. This allows for extensions where the underlying
> memory is not a page.
>
> Signed-off-by: Mina Almasry <[email protected]>
>
> ---
>
> v2:
>
> - Also fix skb_frag_off() + skb_frag_size() (David)
> - Did not apply the reviewed-by from Stefano since the patch changed
> relatively much.
>
> ---
> net/vmw_vsock/virtio_transport.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> index f495b9e5186b..1748268e0694 100644
> --- a/net/vmw_vsock/virtio_transport.c
> +++ b/net/vmw_vsock/virtio_transport.c
> @@ -153,10 +153,10 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> * 'virt_to_phys()' later to fill the buffer descriptor.
> * We don't touch memory at "virtual" address of this page.
> */
> - va = page_to_virt(skb_frag->bv_page);
> + va = page_to_virt(skb_frag_page(skb_frag));
> sg_init_one(sgs[out_sg],
> - va + skb_frag->bv_offset,
> - skb_frag->bv_len);
> + va + skb_frag_off(skb_frag),
> + skb_frag_size(skb_frag));
> out_sg++;
> }
> }

If there are requests for further revision in the series, can send
this virtio cleanup on its own to get it off the stack.

> --
> 2.43.0.472.g3155946c3a-goog
>



2023-12-21 21:46:20

by Shakeel Butt

[permalink] [raw]
Subject: Re: [PATCH net-next v3 1/3] vsock/virtio: use skb_frag_*() helpers

On Wed, Dec 20, 2023 at 1:45 PM Mina Almasry <[email protected]> wrote:
>
> Minor fix for virtio: code wanting to access the fields inside an skb
> frag should use the skb_frag_*() helpers, instead of accessing the
> fields directly. This allows for extensions where the underlying
> memory is not a page.
>
> Signed-off-by: Mina Almasry <[email protected]>

Reviewed-by: Shakeel Butt <[email protected]>

2023-12-21 23:25:19

by Shakeel Butt

[permalink] [raw]
Subject: Re: [PATCH net-next v3 2/3] net: introduce abstraction for network memory

On Wed, Dec 20, 2023 at 01:45:01PM -0800, Mina Almasry wrote:
> Add the netmem_ref type, an abstraction for network memory.
>
> To add support for new memory types to the net stack, we must first
> abstract the current memory type. Currently parts of the net stack
> use struct page directly:
>
> - page_pool
> - drivers
> - skb_frag_t
>
> Originally the plan was to reuse struct page* for the new memory types,
> and to set the LSB on the page* to indicate it's not really a page.
> However, for compiler type checking we need to introduce a new type.
>
> netmem_ref is introduced to abstract the underlying memory type. Currently
> it's a no-op abstraction that is always a struct page underneath. In
> parallel there is an undergoing effort to add support for devmem to the
> net stack:
>
> https://lore.kernel.org/netdev/[email protected]/
>
> Signed-off-by: Mina Almasry <[email protected]>
>
> ---
>
> v3:
>
> - Modify struct netmem from a union of struct page + new types to an opaque
> netmem_ref type. I went with:
>
> +typedef void *__bitwise netmem_ref;
>
> rather than this that Jakub recommended:
>
> +typedef unsigned long __bitwise netmem_ref;
>
> Because with the latter the compiler issues warnings to cast NULL to
> netmem_ref. I hope that's ok.
>

Can you share what the warning was? You might just need __force
attribute. However you might need this __force a lot. I wonder if you
can just follow struct encoded_page example verbatim here.


2023-12-21 23:45:34

by Mina Almasry

[permalink] [raw]
Subject: Re: [PATCH net-next v3 2/3] net: introduce abstraction for network memory

On Thu, Dec 21, 2023 at 3:23 PM Shakeel Butt <[email protected]> wrote:
>
> On Wed, Dec 20, 2023 at 01:45:01PM -0800, Mina Almasry wrote:
> > Add the netmem_ref type, an abstraction for network memory.
> >
> > To add support for new memory types to the net stack, we must first
> > abstract the current memory type. Currently parts of the net stack
> > use struct page directly:
> >
> > - page_pool
> > - drivers
> > - skb_frag_t
> >
> > Originally the plan was to reuse struct page* for the new memory types,
> > and to set the LSB on the page* to indicate it's not really a page.
> > However, for compiler type checking we need to introduce a new type.
> >
> > netmem_ref is introduced to abstract the underlying memory type. Currently
> > it's a no-op abstraction that is always a struct page underneath. In
> > parallel there is an undergoing effort to add support for devmem to the
> > net stack:
> >
> > https://lore.kernel.org/netdev/[email protected]/
> >
> > Signed-off-by: Mina Almasry <[email protected]>
> >
> > ---
> >
> > v3:
> >
> > - Modify struct netmem from a union of struct page + new types to an opaque
> > netmem_ref type. I went with:
> >
> > +typedef void *__bitwise netmem_ref;
> >
> > rather than this that Jakub recommended:
> >
> > +typedef unsigned long __bitwise netmem_ref;
> >
> > Because with the latter the compiler issues warnings to cast NULL to
> > netmem_ref. I hope that's ok.
> >
>
> Can you share what the warning was? You might just need __force
> attribute. However you might need this __force a lot. I wonder if you
> can just follow struct encoded_page example verbatim here.
>

The warning is like so:

./include/net/page_pool/helpers.h: In function ‘page_pool_alloc’:
./include/linux/stddef.h:8:14: warning: returning ‘void *’ from a
function with return type ‘netmem_ref’ {aka ‘long unsigned int’} makes
integer from pointer without a cast [-Wint-conversion]
8 | #define NULL ((void *)0)
| ^
./include/net/page_pool/helpers.h:132:24: note: in expansion of macro
‘NULL’
132 | return NULL;
| ^~~~

And happens in all the code where:

netmem_ref func()
{
return NULL;
}

It's fixable by changing the return to `return (netmem_ref NULL);` or
`return 0;`, but I feel like netmem_ref should be some type which
allows a cast from NULL implicitly.

Also as you (and patchwork) noticed, __bitwise should not be used with
void*; it's only meant for integer types. Sorry I missed that in the
docs and was not running make C=2.

--
Thanks,
Mina

2024-01-02 10:00:39

by Stefano Garzarella

[permalink] [raw]
Subject: Re: [PATCH net-next v3 1/3] vsock/virtio: use skb_frag_*() helpers

On Wed, Dec 20, 2023 at 01:45:00PM -0800, Mina Almasry wrote:
>Minor fix for virtio: code wanting to access the fields inside an skb
>frag should use the skb_frag_*() helpers, instead of accessing the
>fields directly. This allows for extensions where the underlying
>memory is not a page.
>
>Signed-off-by: Mina Almasry <[email protected]>
>
>---
>
>v2:
>
>- Also fix skb_frag_off() + skb_frag_size() (David)
>- Did not apply the reviewed-by from Stefano since the patch changed
>relatively much.

Sorry for the delay, I was off.

LGTM!

Acked-by: Stefano Garzarella <[email protected]>

Possibly we can also send this patch alone if the series is still under
discussion because it's definitely an improvement to the current code.

Thanks,
Stefano

>
>---
> net/vmw_vsock/virtio_transport.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
>diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>index f495b9e5186b..1748268e0694 100644
>--- a/net/vmw_vsock/virtio_transport.c
>+++ b/net/vmw_vsock/virtio_transport.c
>@@ -153,10 +153,10 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> * 'virt_to_phys()' later to fill the buffer descriptor.
> * We don't touch memory at "virtual" address of this page.
> */
>- va = page_to_virt(skb_frag->bv_page);
>+ va = page_to_virt(skb_frag_page(skb_frag));
> sg_init_one(sgs[out_sg],
>- va + skb_frag->bv_offset,
>- skb_frag->bv_len);
>+ va + skb_frag_off(skb_frag),
>+ skb_frag_size(skb_frag));
> out_sg++;
> }
> }
>--
>2.43.0.472.g3155946c3a-goog
>


2024-01-04 21:45:21

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH net-next v3 2/3] net: introduce abstraction for network memory

On Thu, 21 Dec 2023 15:44:22 -0800 Mina Almasry wrote:
> The warning is like so:
>
> ./include/net/page_pool/helpers.h: In function ‘page_pool_alloc’:
> ./include/linux/stddef.h:8:14: warning: returning ‘void *’ from a
> function with return type ‘netmem_ref’ {aka ‘long unsigned int’} makes
> integer from pointer without a cast [-Wint-conversion]
> 8 | #define NULL ((void *)0)
> | ^
> ./include/net/page_pool/helpers.h:132:24: note: in expansion of macro
> ‘NULL’
> 132 | return NULL;
> | ^~~~
>
> And happens in all the code where:
>
> netmem_ref func()
> {
> return NULL;
> }
>
> It's fixable by changing the return to `return (netmem_ref NULL);` or
> `return 0;`, but I feel like netmem_ref should be some type which
> allows a cast from NULL implicitly.

Why do you think we should be able to cast NULL implicitly?
netmem_ref is a handle, it could possibly be some form of
an ID in the future, rather than a pointer. Or have more low
bits stolen for specific use cases.

unsigned long, and returning 0 as "no handle" makes perfect sense to me.

Note that 0 is a special case, bitwise types are allowed to convert
to 0/bool and 0 is implicitly allowed to become a bitwise type.
This will pass without a warning:

typedef unsigned long __bitwise netmem_ref;

netmem_ref some_code(netmem_ref ref)
{
// direct test is fine
if (!ref)
// 0 "upgrades" without casts
return 0;
// 1 does not, we need __force
return (__force netmem_ref)1 | ref;
}

The __bitwise annotation will make catching people trying
to cast to struct page * trivial.

You seem to be trying hard to make struct netmem a thing.
Perhaps you have a reason I'm not getting?

2024-01-04 22:16:26

by Mina Almasry

[permalink] [raw]
Subject: Re: [PATCH net-next v3 2/3] net: introduce abstraction for network memory

On Thu, Jan 4, 2024 at 1:44 PM Jakub Kicinski <[email protected]> wrote:
>
> On Thu, 21 Dec 2023 15:44:22 -0800 Mina Almasry wrote:
> > The warning is like so:
> >
> > ./include/net/page_pool/helpers.h: In function ‘page_pool_alloc’:
> > ./include/linux/stddef.h:8:14: warning: returning ‘void *’ from a
> > function with return type ‘netmem_ref’ {aka ‘long unsigned int’} makes
> > integer from pointer without a cast [-Wint-conversion]
> > 8 | #define NULL ((void *)0)
> > | ^
> > ./include/net/page_pool/helpers.h:132:24: note: in expansion of macro
> > ‘NULL’
> > 132 | return NULL;
> > | ^~~~
> >
> > And happens in all the code where:
> >
> > netmem_ref func()
> > {
> > return NULL;
> > }
> >
> > It's fixable by changing the return to `return (netmem_ref NULL);` or
> > `return 0;`, but I feel like netmem_ref should be some type which
> > allows a cast from NULL implicitly.
>
> Why do you think we should be able to cast NULL implicitly?
> netmem_ref is a handle, it could possibly be some form of
> an ID in the future, rather than a pointer. Or have more low
> bits stolen for specific use cases.
>
> unsigned long, and returning 0 as "no handle" makes perfect sense to me.
>
> Note that 0 is a special case, bitwise types are allowed to convert
> to 0/bool and 0 is implicitly allowed to become a bitwise type.
> This will pass without a warning:
>
> typedef unsigned long __bitwise netmem_ref;
>
> netmem_ref some_code(netmem_ref ref)
> {
> // direct test is fine
> if (!ref)
> // 0 "upgrades" without casts
> return 0;
> // 1 does not, we need __force
> return (__force netmem_ref)1 | ref;
> }
>
> The __bitwise annotation will make catching people trying
> to cast to struct page * trivial.
>
> You seem to be trying hard to make struct netmem a thing.
> Perhaps you have a reason I'm not getting?

There are a number of functions that return struct page* today that I
convert to return struct netmem* later in the child devmem series, one
example is something like:

struct page *page_pool_alloc(...); // returns NULL on failure.

becomes:

struct netmem *page_pool_alloc(...); // also returns NULL on failure.

rather than,

netmem_ref page_pool_alloc(...); // returns 0 on failure.

I guess in my mind having NULL be castable to the new type makes it so
that I can avoid the additional code churn of converting a bunch of
`return NULL;` to `return 0;`, and maybe the transition from page
pointers to netmem pointers can be more easily done if they're both
compatible pointer types.

But that is not any huge blocker or critical point in my mind, I just
thought this approach is preferred. If conversion to unsigned long
makes more sense to you, I'll respin this like that and do the `NULL
-> 0` conversion everywhere as needed.

--
Thanks,
Mina

2024-01-10 17:54:25

by Shakeel Butt

[permalink] [raw]
Subject: Re: [PATCH net-next v3 2/3] net: introduce abstraction for network memory

On Thu, Jan 4, 2024 at 1:44 PM Jakub Kicinski <[email protected]> wrote:
>
[...]
>
> You seem to be trying hard to make struct netmem a thing.
> Perhaps you have a reason I'm not getting?

Mina already went with your suggestion and that is fine. To me, struct
netmem is more aesthetically aligned with the existing struct
encoded_page approach, but I don't have a strong opinion one way or
the other. However it seems like you have a stronger preference for
__bitwise approach. Is there a technical reason or just aesthetic?

2024-01-11 01:35:40

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH net-next v3 2/3] net: introduce abstraction for network memory

On Wed, 10 Jan 2024 09:50:08 -0800 Shakeel Butt wrote:
> On Thu, Jan 4, 2024 at 1:44 PM Jakub Kicinski <[email protected]> wrote:
> > You seem to be trying hard to make struct netmem a thing.
> > Perhaps you have a reason I'm not getting?
>
> Mina already went with your suggestion and that is fine. To me, struct
> netmem is more aesthetically aligned with the existing struct
> encoded_page approach, but I don't have a strong opinion one way or
> the other. However it seems like you have a stronger preference for
> __bitwise approach. Is there a technical reason or just aesthetic?

Yes, right above the text you quoted:

The __bitwise annotation will make catching people trying
to cast to struct page * trivial.

https://lore.kernel.org/all/[email protected]/