2024-03-19 09:10:34

by Breno Leitao

Subject: [PATCH v4] IB/hfi1: allocate dummy net_device dynamically

Embedding net_device into structures prohibits the usage of flexible
arrays in the net_device structure. For more details, see the discussion
at [1].

Un-embed the net_device from struct hfi1_netdev_rx by converting it
into a pointer. Then use alloc_netdev() to allocate the net_device
object in hfi1_alloc_rx().

[1] https://lore.kernel.org/all/[email protected]/

Signed-off-by: Breno Leitao <[email protected]>
Acked-by: Dennis Dalessandro <[email protected]>

---
Changelog

v2:
* Free struct hfi1_netdev_rx allocation if alloc_netdev() fails
* Pass zero as the private size for alloc_netdev().
* Remove wrong reference for iwl in the comments

v3:
* Re-worded the comment, by removing the first paragraph.

v4:
* Fix the changelog format
---
drivers/infiniband/hw/hfi1/netdev.h | 2 +-
drivers/infiniband/hw/hfi1/netdev_rx.c | 10 ++++++++--
2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/netdev.h b/drivers/infiniband/hw/hfi1/netdev.h
index 8aa074670a9c..07c8f77c9181 100644
--- a/drivers/infiniband/hw/hfi1/netdev.h
+++ b/drivers/infiniband/hw/hfi1/netdev.h
@@ -49,7 +49,7 @@ struct hfi1_netdev_rxq {
  * When 0 receive queues will be freed.
  */
 struct hfi1_netdev_rx {
-	struct net_device rx_napi;
+	struct net_device *rx_napi;
 	struct hfi1_devdata *dd;
 	struct hfi1_netdev_rxq *rxq;
 	int num_rx_q;
diff --git a/drivers/infiniband/hw/hfi1/netdev_rx.c b/drivers/infiniband/hw/hfi1/netdev_rx.c
index 720d4c85c9c9..cd6e78e257ef 100644
--- a/drivers/infiniband/hw/hfi1/netdev_rx.c
+++ b/drivers/infiniband/hw/hfi1/netdev_rx.c
@@ -188,7 +188,7 @@ static int hfi1_netdev_rxq_init(struct hfi1_netdev_rx *rx)
 	int i;
 	int rc;
 	struct hfi1_devdata *dd = rx->dd;
-	struct net_device *dev = &rx->rx_napi;
+	struct net_device *dev = rx->rx_napi;

 	rx->num_rx_q = dd->num_netdev_contexts;
 	rx->rxq = kcalloc_node(rx->num_rx_q, sizeof(*rx->rxq),
@@ -360,7 +360,12 @@ int hfi1_alloc_rx(struct hfi1_devdata *dd)
 	if (!rx)
 		return -ENOMEM;
 	rx->dd = dd;
-	init_dummy_netdev(&rx->rx_napi);
+	rx->rx_napi = alloc_netdev(0, "dummy", NET_NAME_UNKNOWN,
+				   init_dummy_netdev);
+	if (!rx->rx_napi) {
+		kfree(rx);
+		return -ENOMEM;
+	}

 	xa_init(&rx->dev_tbl);
 	atomic_set(&rx->enabled, 0);
@@ -374,6 +379,7 @@ void hfi1_free_rx(struct hfi1_devdata *dd)
 {
 	if (dd->netdev_rx) {
 		dd_dev_info(dd, "hfi1 rx freed\n");
+		free_netdev(dd->netdev_rx->rx_napi);
 		kfree(dd->netdev_rx);
 		dd->netdev_rx = NULL;
 	}
--
2.43.0
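
For readers unfamiliar with the motivation, here is a minimal user-space
sketch of the embed-versus-pointer distinction the commit message
describes. It is not kernel code: struct toy_dev, struct toy_rx and the
toy_* functions are hypothetical stand-ins invented for illustration. It
assumes only standard C, where a structure ending in a flexible array
member has to be sized at allocation time and so cannot portably sit in
the middle of another structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical miniature of a structure that ends in a flexible array. */
struct toy_dev {
	int flags;
	unsigned char priv[];	/* flexible array member: must be last */
};

/*
 * Owner structure after the conversion: it holds a pointer, so the
 * toy_dev object is allocated separately with whatever trailing size
 * it needs. Embedding "struct toy_dev rx_napi;" here, followed by
 * other members, is what ISO C does not allow.
 */
struct toy_rx {
	struct toy_dev *rx_napi;
	int num_rx_q;
};

/* Allocate a toy_dev with priv_size bytes of trailing storage. */
static struct toy_dev *toy_dev_alloc(size_t priv_size)
{
	return calloc(1, sizeof(struct toy_dev) + priv_size);
}

/* Mirrors the shape of the patch: free the owner if the inner alloc fails. */
static struct toy_rx *toy_rx_alloc(void)
{
	struct toy_rx *rx = calloc(1, sizeof(*rx));

	if (!rx)
		return NULL;
	rx->rx_napi = toy_dev_alloc(0);
	if (!rx->rx_napi) {
		free(rx);
		return NULL;
	}
	return rx;
}

/* Release the separately allocated object before its owner. */
static void toy_rx_free(struct toy_rx *rx)
{
	assert(rx != NULL);
	free(rx->rx_napi);
	free(rx);
}
```

The allocation and release order follows the hunks above: the inner
object is allocated after the owner and freed before it.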



2024-04-01 11:53:44

by Leon Romanovsky

Subject: Re: [PATCH v4] IB/hfi1: allocate dummy net_device dynamically

On Tue, Mar 19, 2024 at 02:09:43AM -0700, Breno Leitao wrote:
> Embedding net_device into structures prohibits the usage of flexible
> arrays in the net_device structure. For more details, see the discussion
> at [1].
>
> Un-embed the net_device from struct hfi1_netdev_rx by converting it
> into a pointer. Then use the leverage alloc_netdev() to allocate the
> net_device object at hfi1_alloc_rx().
>
> [1] https://lore.kernel.org/all/[email protected]/
>
> Signed-off-by: Breno Leitao <[email protected]>
> Acked-by: Dennis Dalessandro <[email protected]>

Jakub,

I created a shared branch for you; please pull it from:
https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/log/?h=remove-dummy-netdev

Thanks

2024-04-01 11:53:59

by Leon Romanovsky

Subject: Re: [PATCH v4] IB/hfi1: allocate dummy net_device dynamically


On Tue, 19 Mar 2024 02:09:43 -0700, Breno Leitao wrote:
> Embedding net_device into structures prohibits the usage of flexible
> arrays in the net_device structure. For more details, see the discussion
> at [1].
>
> Un-embed the net_device from struct hfi1_netdev_rx by converting it
> into a pointer. Then use the leverage alloc_netdev() to allocate the
> net_device object at hfi1_alloc_rx().
>
> [...]

Applied, thanks!

[1/1] IB/hfi1: allocate dummy net_device dynamically
https://git.kernel.org/rdma/rdma/c/c965b039a750c4

Best regards,
--
Leon Romanovsky <[email protected]>


2024-04-01 14:53:16

by Jakub Kicinski

Subject: Re: [PATCH v4] IB/hfi1: allocate dummy net_device dynamically

On Mon, 1 Apr 2024 14:53:31 +0300 Leon Romanovsky wrote:
> On Tue, Mar 19, 2024 at 02:09:43AM -0700, Breno Leitao wrote:
> > Embedding net_device into structures prohibits the usage of flexible
> > arrays in the net_device structure. For more details, see the discussion
> > at [1].
> >
> > Un-embed the net_device from struct hfi1_netdev_rx by converting it
> > into a pointer. Then use the leverage alloc_netdev() to allocate the
> > net_device object at hfi1_alloc_rx().
> >
> > [1] https://lore.kernel.org/all/[email protected]/
> >
> > Signed-off-by: Breno Leitao <[email protected]>
> > Acked-by: Dennis Dalessandro <[email protected]>
>
> Jakub,
>
> I create shared branch for you, please pull it from:
> https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/log/?h=remove-dummy-netdev

Did you merge it in already?
Turned out that the use of init_dummy_netdev as a setup function
is broken, I'm not sure how Dennis tested this :(
We should have pinged you, sorry.

2024-04-01 15:34:46

by Dennis Dalessandro

Subject: Re: [PATCH v4] IB/hfi1: allocate dummy net_device dynamically

On 4/1/24 10:53 AM, Jakub Kicinski wrote:
> On Mon, 1 Apr 2024 14:53:31 +0300 Leon Romanovsky wrote:
>> On Tue, Mar 19, 2024 at 02:09:43AM -0700, Breno Leitao wrote:
>>> Embedding net_device into structures prohibits the usage of flexible
>>> arrays in the net_device structure. For more details, see the discussion
>>> at [1].
>>>
>>> Un-embed the net_device from struct hfi1_netdev_rx by converting it
>>> into a pointer. Then use the leverage alloc_netdev() to allocate the
>>> net_device object at hfi1_alloc_rx().
>>>
>>> [1] https://lore.kernel.org/all/[email protected]/
>>>
>>> Signed-off-by: Breno Leitao <[email protected]>
>>> Acked-by: Dennis Dalessandro <[email protected]>
>>
>> Jakub,
>>
>> I create shared branch for you, please pull it from:
>> https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/log/?h=remove-dummy-netdev
>
> Did you merge it in already?
> Turned out that the use of init_dummy_netdev as a setup function
> is broken, I'm not sure how Dennis tested this :(
> We should have pinged you, sorry.

This is what I tested: Linus's 6.8 tag + the cherry-pick + Breno's patch.
So if something went in later that broke it, I didn't have it in my tree.

commit 311810a6d7e37d8e7537d50e26197b7f5f02f164 (linus-master)
Author: Breno Leitao <[email protected]>
Date: Wed Mar 13 03:33:10 2024 -0700

IB/hfi1: allocate dummy net_device dynamically

struct net_device shouldn't be embedded into any structure, instead,
the owner should use the priv space to embed their state into net_device.

Embedding net_device into structures prohibits the usage of flexible
arrays in the net_device structure. For more details, see the discussion
at [1].

Un-embed the net_device from struct hfi1_netdev_rx by converting it
into a pointer. Then use the leverage alloc_netdev() to allocate the
net_device object at hfi1_alloc_rx().

[1] https://lore.kernel.org/all/[email protected]/

Signed-off-by: Breno Leitao <[email protected]>

----
PS: this diff needs d160c66cda0ac8614 ("net: Do not return value from
init_dummy_netdev()") in order to apply and build cleanly.

commit 1e06cffe69e6519f8ede42c60f13ad3a7ddb09b7
Author: Amit Cohen <[email protected]>
Date: Mon Feb 5 12:30:22 2024 +0200

net: Do not return value from init_dummy_netdev()

init_dummy_netdev() always returns zero and all the callers do not check
the returned value. Set the function to not return value, as it is not
really used today.

Signed-off-by: Amit Cohen <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

commit e8f897f4afef0031fe618a8e94127a0934896aba (tag: v6.8)
Author: Linus Torvalds <[email protected]>
Date: Sun Mar 10 13:38:09 2024 -0700

Linux 6.8




2024-04-01 18:00:44

by Leon Romanovsky

Subject: Re: [PATCH v4] IB/hfi1: allocate dummy net_device dynamically

On Mon, Apr 01, 2024 at 07:53:06AM -0700, Jakub Kicinski wrote:
> On Mon, 1 Apr 2024 14:53:31 +0300 Leon Romanovsky wrote:
> > On Tue, Mar 19, 2024 at 02:09:43AM -0700, Breno Leitao wrote:
> > > Embedding net_device into structures prohibits the usage of flexible
> > > arrays in the net_device structure. For more details, see the discussion
> > > at [1].
> > >
> > > Un-embed the net_device from struct hfi1_netdev_rx by converting it
> > > into a pointer. Then use the leverage alloc_netdev() to allocate the
> > > net_device object at hfi1_alloc_rx().
> > >
> > > [1] https://lore.kernel.org/all/[email protected]/
> > >
> > > Signed-off-by: Breno Leitao <[email protected]>
> > > Acked-by: Dennis Dalessandro <[email protected]>
> >
> > Jakub,
> >
> > I create shared branch for you, please pull it from:
> > https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/log/?h=remove-dummy-netdev
>
> Did you merge it in already?

I merged it into the testing branch and have dropped it now.

Thanks

2024-04-03 12:18:20

by Breno Leitao

Subject: Re: [PATCH v4] IB/hfi1: allocate dummy net_device dynamically

On Mon, Apr 01, 2024 at 11:34:23AM -0400, Dennis Dalessandro wrote:
> On 4/1/24 10:53 AM, Jakub Kicinski wrote:
> > On Mon, 1 Apr 2024 14:53:31 +0300 Leon Romanovsky wrote:
> >> On Tue, Mar 19, 2024 at 02:09:43AM -0700, Breno Leitao wrote:
> >>> Embedding net_device into structures prohibits the usage of flexible
> >>> arrays in the net_device structure. For more details, see the discussion
> >>> at [1].
> >>>
> >>> Un-embed the net_device from struct hfi1_netdev_rx by converting it
> >>> into a pointer. Then use the leverage alloc_netdev() to allocate the
> >>> net_device object at hfi1_alloc_rx().
> >>>
> >>> [1] https://lore.kernel.org/all/[email protected]/
> >>>
> >>> Signed-off-by: Breno Leitao <[email protected]>
> >>> Acked-by: Dennis Dalessandro <[email protected]>
> >>
> >> Jakub,
> >>
> >> I create shared branch for you, please pull it from:
> >> https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/log/?h=remove-dummy-netdev
> >
> > Did you merge it in already?
> > Turned out that the use of init_dummy_netdev as a setup function
> > is broken, I'm not sure how Dennis tested this :(
> > We should have pinged you, sorry.
>
> This is what I tested, Linus 6.8 tag + cherry pick + Breno patch. So if
> something went in that broke it I didn't have it in my tree.
>
> commit 311810a6d7e37d8e7537d50e26197b7f5f02f164 (linus-master)
> Author: Breno Leitao <[email protected]>
> Date: Wed Mar 13 03:33:10 2024 -0700
>
> IB/hfi1: allocate dummy net_device dynamically

This one has a potential bug that causes a kernel panic when the module
is removed.

This is because alloc_netdev() allocates some data structures that are
later overwritten (memset) by init_dummy_netdev(). At free time,
free_netdev() dereferences those structures, which are by then zeroed.

An upcoming patch creates a helper (alloc_netdev_dummy()) that
allocates the netdev and calls a special version of
init_dummy_netdev() that does not memset the structure.

I would drop this patch for now, and I will submit a new version using
the new helper.
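
The failure mode described above can be modelled in user space. The
following is a minimal sketch, not kernel code: struct toy_netdev and all
toy_* functions are hypothetical stand-ins. It assumes, as the message
says, that the allocator sets up internal state before calling the setup
callback, and that a setup callback which memsets the object therefore
wipes that state.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct net_device; not the kernel type. */
struct toy_netdev {
	void *core_state;	/* models state set up by the allocator */
	int is_dummy;		/* models what the dummy init is meant to set */
};

/* Models an init that wipes the whole object before initializing it. */
static void toy_init_dummy_memset(struct toy_netdev *dev)
{
	memset(dev, 0, sizeof(*dev));
	dev->is_dummy = 1;
}

/* Models a "special version" of the init that skips the memset. */
static void toy_init_dummy_noclear(struct toy_netdev *dev)
{
	dev->is_dummy = 1;
}

/* Models the allocator: allocate, set up internal state, then call setup. */
static struct toy_netdev *toy_alloc(void (*setup)(struct toy_netdev *))
{
	struct toy_netdev *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	dev->core_state = malloc(16);
	if (!dev->core_state) {
		free(dev);
		return NULL;
	}
	setup(dev);	/* a memset here loses (and leaks) core_state */
	return dev;
}

/*
 * Models the free path. Returns 0 if the allocator's state survived and
 * -1 if it was wiped; this toy only reports the loss, whereas the
 * message above says the real free path dereferences the zeroed state.
 */
static int toy_free(struct toy_netdev *dev)
{
	int intact;

	assert(dev != NULL);
	intact = dev->core_state != NULL;
	free(dev->core_state);
	free(dev);
	return intact ? 0 : -1;
}
```

Passing toy_init_dummy_memset as the setup callback makes toy_free()
report the lost state, while toy_init_dummy_noclear leaves it intact,
which is the property the special version without the memset is meant
to provide.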