2022-08-08 08:42:00

by Lin Ma

[permalink] [raw]
Subject: [PATCH v0] idb: Add rtnl_lock to avoid data race

The commit c23d92b80e0b ("igb: Teardown SR-IOV before
unregister_netdev()") places the unregister_netdev() call after the
igb_disable_sriov() call to avoid functionality issue.

However, it introduces several race conditions when detaching a device.
For example, when .remove() is called, the below interleaving leads to
use-after-free.

(FREE from device detaching) | (USE from netdev core)
igb_remove | igb_ndo_get_vf_config
igb_disable_sriov | vf >= adapter->vfs_allocated_count?
kfree(adapter->vf_data) |
adapter->vfs_allocated_count = 0 |
| memcpy(... adapter->vf_data[vf]

In short, there are data races between read and write of
adapter->vfs_allocated_count. To fix this, we can add a new lock to
protect members in adapter object. However, we cau use the existing
rtnl_lock just as other drivers do. (See how dpaa2_eth_disconnect_mac is
protected in dpaa2_eth_remove function). This patch adopts similar
fixes.

Fixes: c23d92b80e0b ("igb: Teardown SR-IOV before unregister_netdev()")
Signed-off-by: Lin Ma <[email protected]>
---
drivers/net/ethernet/intel/igb/igb_main.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index d8b836a85cc3..e86ea4de05f8 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3814,7 +3814,9 @@ static void igb_remove(struct pci_dev *pdev)
igb_release_hw_control(adapter);

#ifdef CONFIG_PCI_IOV
+ rtnl_lock();
igb_disable_sriov(pdev);
+ rtnl_unlock();
#endif

unregister_netdev(netdev);
--
2.36.1


2022-08-08 19:30:30

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH v0] idb: Add rtnl_lock to avoid data race

On Mon, 8 Aug 2022 16:10:50 +0800 Lin Ma wrote:
> The commit c23d92b80e0b ("igb: Teardown SR-IOV before
> unregister_netdev()") places the unregister_netdev() call after the
> igb_disable_sriov() call to avoid functionality issue.
>
> However, it introduces several race conditions when detaching a device.
> For example, when .remove() is called, the below interleaving leads to
> use-after-free.
>
> (FREE from device detaching) | (USE from netdev core)
> igb_remove | igb_ndo_get_vf_config
> igb_disable_sriov | vf >= adapter->vfs_allocated_count?
> kfree(adapter->vf_data) |
> adapter->vfs_allocated_count = 0 |
> | memcpy(... adapter->vf_data[vf]
>
> In short, there are data races between read and write of
> adapter->vfs_allocated_count. To fix this, we can add a new lock to
> protect members in adapter object. However, we cau use the existing
> rtnl_lock just as other drivers do. (See how dpaa2_eth_disconnect_mac is
> protected in dpaa2_eth_remove function). This patch adopts similar
> fixes.
>
> Fixes: c23d92b80e0b ("igb: Teardown SR-IOV before unregister_netdev()")
> Signed-off-by: Lin Ma <[email protected]>
> ---
> drivers/net/ethernet/intel/igb/igb_main.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
> index d8b836a85cc3..e86ea4de05f8 100644
> --- a/drivers/net/ethernet/intel/igb/igb_main.c
> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> @@ -3814,7 +3814,9 @@ static void igb_remove(struct pci_dev *pdev)
> igb_release_hw_control(adapter);
>
> #ifdef CONFIG_PCI_IOV
> + rtnl_lock();
> igb_disable_sriov(pdev);
> + rtnl_unlock();
> #endif
>
> unregister_netdev(netdev);

What about the disable path coming from sysfs? This looks incomplete to
me. Perhaps take a look at commit 1e53834ce541 ("ixgbe: Add locking to
prevent panic when setting sriov_numvfs to zero") for some inspiration.

2022-08-08 20:55:10

by Edward Cree

[permalink] [raw]
Subject: Re: [PATCH v0] idb: Add rtnl_lock to avoid data race

s/idb/igb in Subject?

-ed

2022-08-09 03:15:43

by Lin Ma

[permalink] [raw]
Subject: Re: [PATCH v0] idb: Add rtnl_lock to avoid data race

Hello there,

>
> What about the disable path coming from sysfs? This looks incomplete to
> me. Perhaps take a look at commit 1e53834ce541 ("ixgbe: Add locking to
> prevent panic when setting sriov_numvfs to zero") for some inspiration.

Thanks for the advice, I sent the new version of the patch which uses a new spinlock to avoid race cases such as described in commit 1e53834ce541.

Additionally, I also keep the rtnl_lock to eliminate the races that come from netdev core. Although this can also be handled with the newly added spinlock, I found that adding the spinlock every time accessing the VF resources is not trivial.
(If you think that keep using the spinlock is better I will craft a new version of patch)

It seems that ixgbe_disable_sriov also suffers from the mentioned races from netdev core. If you think the rtnl_lock solution is fine, I will also send a patch for that driver too.

Thanks
Lin Ma