On Wed, Feb 28, 2024 at 10:41:37PM +0200, Andy Shevchenko wrote:
> We have two new helpers struct_size_with_data() and struct_data_pointer()
> that we can utilize in alloc_netdev_mqs() and netdev_priv(). Do it so.
>
> Signed-off-by: Andy Shevchenko <[email protected]>
> ---
> include/linux/netdevice.h | 3 ++-
> net/core/dev.c | 10 +++++-----
> 2 files changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index c41019f34179..d046dca18854 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -25,6 +25,7 @@
> #include <linux/bug.h>
> #include <linux/delay.h>
> #include <linux/atomic.h>
> +#include <linux/overflow.h>
> #include <linux/prefetch.h>
> #include <asm/cache.h>
> #include <asm/byteorder.h>
> @@ -2668,7 +2669,7 @@ void dev_net_set(struct net_device *dev, struct net *net)
> */
> static inline void *netdev_priv(const struct net_device *dev)
> {
> - return (char *)dev + ALIGN(sizeof(struct net_device), NETDEV_ALIGN);
> + return struct_data_pointer(dev, NETDEV_ALIGN);
> }
I really don't like hiding these trailing allocations from the compiler.
Why can't something like this be done (totally untested):
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 118c40258d07..dae6df4fb177 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2475,6 +2475,8 @@ struct net_device {
/** @page_pools: page pools created for this netdevice */
struct hlist_head page_pools;
#endif
+ u32 priv_size;
+ u8 priv_data[] __counted_by(priv_size) __aligned(NETDEV_ALIGN);
};
#define to_net_dev(d) container_of(d, struct net_device, dev)
@@ -2665,7 +2667,7 @@ void dev_net_set(struct net_device *dev, struct net *net)
*/
static inline void *netdev_priv(const struct net_device *dev)
{
- return (char *)dev + ALIGN(sizeof(struct net_device), NETDEV_ALIGN);
+ return dev->priv_data;
}
/* Set the sysfs physical device reference for the network logical device
diff --git a/net/core/dev.c b/net/core/dev.c
index cb2dab0feee0..afaaa3224656 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10814,18 +10814,14 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
return NULL;
}
- alloc_size = sizeof(struct net_device);
- if (sizeof_priv) {
- /* ensure 32-byte alignment of private area */
- alloc_size = ALIGN(alloc_size, NETDEV_ALIGN);
- alloc_size += sizeof_priv;
- }
+ alloc_size = struct_size(p, priv_data, sizeof_priv);
/* ensure 32-byte alignment of whole construct */
alloc_size += NETDEV_ALIGN - 1;
p = kvzalloc(alloc_size, GFP_KERNEL_ACCOUNT | __GFP_RETRY_MAYFAIL);
if (!p)
return NULL;
+ p->priv_size = sizeof_priv;
dev = PTR_ALIGN(p, NETDEV_ALIGN);
dev->padded = (char *)dev - (char *)p;
--
Kees Cook
On Wed, Feb 28, 2024 at 01:46:10PM -0800, Kees Cook wrote:
> On Wed, Feb 28, 2024 at 10:41:37PM +0200, Andy Shevchenko wrote:
..
> > static inline void *netdev_priv(const struct net_device *dev)
> > {
> > - return (char *)dev + ALIGN(sizeof(struct net_device), NETDEV_ALIGN);
> > + return struct_data_pointer(dev, NETDEV_ALIGN);
> > }
>
> I really don't like hiding these trailing allocations from the compiler.
> Why can't something like this be done (totally untested):
Below is interesting idea, now at least I started understanding your previous
comments.
--
With Best Regards,
Andy Shevchenko
On Wed, 28 Feb 2024 13:46:10 -0800 Kees Cook wrote:
> I really don't like hiding these trailing allocations from the compiler.
> Why can't something like this be done (totally untested):
>
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 118c40258d07..dae6df4fb177 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -2475,6 +2475,8 @@ struct net_device {
> /** @page_pools: page pools created for this netdevice */
> struct hlist_head page_pools;
> #endif
> + u32 priv_size;
> + u8 priv_data[] __counted_by(priv_size) __aligned(NETDEV_ALIGN);
I like, FWIW, please submit! :)
On Wed, Feb 28, 2024 at 02:41:48PM -0800, Jakub Kicinski wrote:
> On Wed, 28 Feb 2024 13:46:10 -0800 Kees Cook wrote:
> > I really don't like hiding these trailing allocations from the compiler.
> > Why can't something like this be done (totally untested):
> >
> >
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index 118c40258d07..dae6df4fb177 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -2475,6 +2475,8 @@ struct net_device {
> > /** @page_pools: page pools created for this netdevice */
> > struct hlist_head page_pools;
> > #endif
> > + u32 priv_size;
> > + u8 priv_data[] __counted_by(priv_size) __aligned(NETDEV_ALIGN);
>
> I like, FWIW, please submit! :)
So, I found several cases where struct net_device is included in the
middle of another structure, which makes my proposal more awkward. But I
also don't understand why it's in the _middle_. Shouldn't it always be
at the beginning (with priv stuff following it?)
Quick search and examined manually: git grep 'struct net_device [a-z0-9_]*;'
struct rtw89_dev
struct ath10k
etc.
Some even have two included (?)
But I still like the idea -- Gustavo has been solving these cases with
having two structs, e.g.:
struct net_device {
...unchanged...
};
struct net_device_alloc {
struct net_device dev;
u32 priv_size;
u8 priv_data[] __counted_by(priv_size) __aligned(NETDEV_ALIGN);
};
And internals can use struct net_device_alloc...
-Kees
--
Kees Cook
On 2/28/24 18:01, Kees Cook wrote:
> On Wed, Feb 28, 2024 at 02:41:48PM -0800, Jakub Kicinski wrote:
>> On Wed, 28 Feb 2024 13:46:10 -0800 Kees Cook wrote:
>>> I really don't like hiding these trailing allocations from the compiler.
>>> Why can't something like this be done (totally untested):
>>>
>>>
>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>> index 118c40258d07..dae6df4fb177 100644
>>> --- a/include/linux/netdevice.h
>>> +++ b/include/linux/netdevice.h
>>> @@ -2475,6 +2475,8 @@ struct net_device {
>>> /** @page_pools: page pools created for this netdevice */
>>> struct hlist_head page_pools;
>>> #endif
>>> + u32 priv_size;
>>> + u8 priv_data[] __counted_by(priv_size) __aligned(NETDEV_ALIGN);
>>
>> I like, FWIW, please submit! :)
>
> So, I found several cases where struct net_device is included in the
> middle of another structure, which makes my proposal more awkward. But I
> also don't understand why it's in the _middle_. Shouldn't it always be
> at the beginning (with priv stuff following it?)
> Quick search and examined manually: git grep 'struct net_device [a-z0-9_]*;'
>
> struct rtw89_dev
> struct ath10k
> etc.
>
> Some even have two included (?)
>
> But I still like the idea -- Gustavo has been solving these cases with
> having two structs, e.g.:
>
> struct net_device {
> ...unchanged...
> };
>
> struct net_device_alloc {
> struct net_device dev;
> u32 priv_size;
> u8 priv_data[] __counted_by(priv_size) __aligned(NETDEV_ALIGN);
> };
>
> And internals can use struct net_device_alloc...
Yep, we should really consider going with the above, otherwise we would
have to do something like the following, to avoid having the flexible-array
member nested in the middle of other structs:
struct net_device {
struct_group_tagged(net_device_hdr, hdr,
...
u32 priv_size;
);
u8 priv_data[] __counted_by(priv_size) __aligned(NETDEV_ALIGN);
}
We are grouping together the members in `struct net_device`, except the
flexible-array member, into a tagged `struct net_device_hdr`. This allows
us to exclude the flex array from its inclusion in any other struct
that contains `struct net_device` as a member without having to create
a completely separate struct definition.
And let's take as example `struct hfi1_netdev_rx`, where `struct net_device` is
included in the beginning:
drivers/infiniband/hw/hfi1/netdev.h:
struct hfi1_netdev_rx {
- struct net_device rx_napi;
+ struct net_device_hdr rx_napi;
struct hfi1_devdata *dd;
struct hfi1_netdev_rxq *rxq;
int num_rx_q;
int rmt_start;
struct xarray dev_tbl;
/* count of enabled napi polls */
atomic_t enabled;
/* count of netdevs on top */
atomic_t netdevs;
};
Of course we would also have to update the code that access `struct net_device`
members through `rx_napi` in `struct hfi1_netdev_rx`.
I'm currently working on the above solution for all the cases where having two
separate structs is not currently feasible. And with that we are looking to enable
`-Wflex-array-member-not-at-end`
So, if we can prevent this from the beginning it'd be really great. :)
--
Gustavo
On Wed, 28 Feb 2024 16:01:49 -0800 Kees Cook wrote:
> So, I found several cases where struct net_device is included in the
> middle of another structure, which makes my proposal more awkward. But I
> also don't understand why it's in the _middle_. Shouldn't it always be
> at the beginning (with priv stuff following it?)
> Quick search and examined manually: git grep 'struct net_device [a-z0-9_]*;'
>
> struct rtw89_dev
> struct ath10k
> etc.
Ugh, yes, the (ab)use of NAPI.
> Some even have two included (?)
And some seem to be cargo-culted from out-of-tree code and are unused :S
> But I still like the idea -- Gustavo has been solving these cases with
> having two structs, e.g.:
>
> struct net_device {
> ...unchanged...
> };
>
> struct net_device_alloc {
> struct net_device dev;
> u32 priv_size;
> u8 priv_data[] __counted_by(priv_size) __aligned(NETDEV_ALIGN);
> };
>
> And internals can use struct net_device_alloc...
That's... less pretty. I'd rather push the ugly into the questionable
users. Make them either allocate the netdev dynamically and store
a pointer, or move the netdev to the end of the struct.
But yeah, that's a bit of a cleanup :(
On Wed, 28 Feb 2024 18:49:25 -0600 Gustavo A. R. Silva wrote:
> struct net_device {
> struct_group_tagged(net_device_hdr, hdr,
> ...
> u32 priv_size;
> );
> u8 priv_data[] __counted_by(priv_size) __aligned(NETDEV_ALIGN);
> }
No, no, that's not happening.
On 2/28/24 18:57, Jakub Kicinski wrote:
> On Wed, 28 Feb 2024 18:49:25 -0600 Gustavo A. R. Silva wrote:
>> struct net_device {
>> struct_group_tagged(net_device_hdr, hdr,
>> ...
>> u32 priv_size;
>> );
>> u8 priv_data[] __counted_by(priv_size) __aligned(NETDEV_ALIGN);
>> }
>
> No, no, that's not happening.
Thanks, one less flex-struct to change. :)
On Wed, Feb 28, 2024 at 04:01:49PM -0800, Kees Cook wrote:
> On Wed, Feb 28, 2024 at 02:41:48PM -0800, Jakub Kicinski wrote:
> > On Wed, 28 Feb 2024 13:46:10 -0800 Kees Cook wrote:
..
> But I still like the idea -- Gustavo has been solving these cases with
> having two structs, e.g.:
>
> struct net_device {
> ...unchanged...
> };
>
> struct net_device_alloc {
> struct net_device dev;
> u32 priv_size;
> u8 priv_data[] __counted_by(priv_size) __aligned(NETDEV_ALIGN);
> };
>
> And internals can use struct net_device_alloc...
I just realized that I made same approach in
f6d7f050e258 ("spi: Don't use flexible array in struct spi_message definition")
75e308ffc4f0 ("spi: Use struct_size() helper")
--
With Best Regards,
Andy Shevchenko
On Wed, Feb 28, 2024 at 04:56:09PM -0800, Jakub Kicinski wrote:
> On Wed, 28 Feb 2024 16:01:49 -0800 Kees Cook wrote:
> > So, I found several cases where struct net_device is included in the
> > middle of another structure, which makes my proposal more awkward. But I
> > also don't understand why it's in the _middle_. Shouldn't it always be
> > at the beginning (with priv stuff following it?)
> > Quick search and examined manually: git grep 'struct net_device [a-z0-9_]*;'
> >
> > struct rtw89_dev
> > struct ath10k
> > etc.
>
> Ugh, yes, the (ab)use of NAPI.
>
> > Some even have two included (?)
>
> And some seem to be cargo-culted from out-of-tree code and are unused :S
Ah, which can I remove?
> That's... less pretty. I'd rather push the ugly into the questionable
> users. Make them either allocate the netdev dynamically and store
> a pointer, or move the netdev to the end of the struct.
>
> But yeah, that's a bit of a cleanup :(
So far, it's not too bad. I'm doing some test builds now.
As a further aside, this code:
struct net_device *dev;
...
struct net_device *p;
...
/* ensure 32-byte alignment of whole construct */
alloc_size += NETDEV_ALIGN - 1;
p = kvzalloc(alloc_size, GFP_KERNEL_ACCOUNT | __GFP_RETRY_MAYFAIL);
...
dev = PTR_ALIGN(p, NETDEV_ALIGN);
Really screams for a dynamic-sized (bucketed) kmem_cache_alloc
API. Alignment constraints can be described in a regular kmem_cache
allocator (rather than this open-coded version). I've been intending to
build that for struct msg_msg for a while now, and here's another user. :)
-Kees
--
Kees Cook
On Thu, 29 Feb 2024 11:08:58 -0800 Kees Cook wrote:
> > And some seem to be cargo-culted from out-of-tree code and are unused :S
>
> Ah, which can I remove?
The one in igc.h does not seem to be referenced by anything in the igc
directory. Pretty sure it's unused.
> As a further aside, this code:
>
> struct net_device *dev;
> ...
> struct net_device *p;
> ...
> /* ensure 32-byte alignment of whole construct */
> alloc_size += NETDEV_ALIGN - 1;
> p = kvzalloc(alloc_size, GFP_KERNEL_ACCOUNT | __GFP_RETRY_MAYFAIL);
> ...
> dev = PTR_ALIGN(p, NETDEV_ALIGN);
>
> Really screams for a dynamic-sized (bucketed) kmem_cache_alloc
> API. Alignment constraints can be described in a regular kmem_cache
> allocator (rather than this open-coded version). I've been intending to
> build that for struct msg_msg for a while now, and here's another user. :)
TBH I'm not sure why we align it :S
NETDEV_ALIGN is 32B so maybe some old cache aligning thing?
On Thu, Feb 29, 2024 at 11:37:06AM -0800, Jakub Kicinski wrote:
> On Thu, 29 Feb 2024 11:08:58 -0800 Kees Cook wrote:
> > > And some seem to be cargo-culted from out-of-tree code and are unused :S
> >
> > Ah, which can I remove?
>
> The one in igc.h does not seem to be referenced by anything in the igc
> directory. Pretty sure it's unused.
I'll double check. After trying to do a few conversions I really hit
stuff I didn't like, so I took a slightly different approach in the
patch I sent.
--
Kees Cook