2021-06-21 15:34:20

by Íñigo Huguet

[permalink] [raw]
Subject: [PATCH 1/4] sfc: avoid double pci_remove of VFs

If pci_remove was called for a PF with VFs, the removal of the VFs was
called twice from efx_ef10_sriov_fini: one directly with pci_driver->remove
and another implicit by calling pci_disable_sriov, which also perform
the VFs remove. This was leading to crashing the kernel on the second
attempt.

Given that pci_disable_sriov already calls to pci remove function, get
rid of the direct call to pci_driver->remove from the driver.

2 different ways to trigger the bug:
- Create one or more VFs, then attach the PF to a virtual machine (at
least with qemu/KVM)
- Create one or more VFs, then remove the PF with:
echo 1 > /sys/bus/pci/devices/PF_PCI_ID/remove

Removing sfc module does not trigger the error, at least for me, because
it removes the VF first, and then the PF.

Example of a log with the error:
list_del corruption, ffff967fd20a8ad0->next is LIST_POISON1 (dead000000000100)
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:47!
[...trimmed...]
RIP: 0010:__list_del_entry_valid.cold.1+0x12/0x4c
[...trimmed...]
Call Trace:
efx_dissociate+0x1f/0x140 [sfc]
efx_pci_remove+0x27/0x150 [sfc]
pci_device_remove+0x3b/0xc0
device_release_driver_internal+0x103/0x1f0
pci_stop_bus_device+0x69/0x90
pci_stop_and_remove_bus_device+0xe/0x20
pci_iov_remove_virtfn+0xba/0x120
sriov_disable+0x2f/0xe0
efx_ef10_pci_sriov_disable+0x52/0x80 [sfc]
? pcie_aer_is_native+0x12/0x40
efx_ef10_sriov_fini+0x72/0x110 [sfc]
efx_pci_remove+0x62/0x150 [sfc]
pci_device_remove+0x3b/0xc0
device_release_driver_internal+0x103/0x1f0
unbind_store+0xf6/0x130
kernfs_fop_write+0x116/0x190
vfs_write+0xa5/0x1a0
ksys_write+0x4f/0xb0
do_syscall_64+0x5b/0x1a0
entry_SYSCALL_64_after_hwframe+0x65/0xca

Signed-off-by: Íñigo Huguet <[email protected]>
---
drivers/net/ethernet/sfc/ef10_sriov.c | 10 +---------
1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10_sriov.c b/drivers/net/ethernet/sfc/ef10_sriov.c
index 21fa6c0e8873..a5d28b0f75ba 100644
--- a/drivers/net/ethernet/sfc/ef10_sriov.c
+++ b/drivers/net/ethernet/sfc/ef10_sriov.c
@@ -439,7 +439,6 @@ int efx_ef10_sriov_init(struct efx_nic *efx)
void efx_ef10_sriov_fini(struct efx_nic *efx)
{
struct efx_ef10_nic_data *nic_data = efx->nic_data;
- unsigned int i;
int rc;

if (!nic_data->vf) {
@@ -449,14 +448,7 @@ void efx_ef10_sriov_fini(struct efx_nic *efx)
return;
}

- /* Remove any VFs in the host */
- for (i = 0; i < efx->vf_count; ++i) {
- struct efx_nic *vf_efx = nic_data->vf[i].efx;
-
- if (vf_efx)
- vf_efx->pci_dev->driver->remove(vf_efx->pci_dev);
- }
-
+ /* Disable SRIOV and remove any VFs in the host */
rc = efx_ef10_pci_sriov_disable(efx, true);
if (rc)
netif_dbg(efx, drv, efx->net_dev,
--
2.31.1


2021-06-21 15:34:34

by Íñigo Huguet

[permalink] [raw]
Subject: [PATCH 2/4] sfc: error code if SRIOV cannot be disabled

If SRIOV cannot be disabled during device removal or module unloading,
return error code so it can be logged properly in the calling function.

Note that this can only happen if any VF is currently attached to a
guest using Xen, but not with vfio/KVM. Despite that in that case the
VFs won't work properly with PF removed and/or the module unloaded, I
have let it as is because I don't know what side effects may have
changing it, and also it seems to be the same that other drivers are
doing in this situation.

In the case of being called during SRIOV reconfiguration, the behavior
hasn't changed because the function is called with force=false.

Signed-off-by: Íñigo Huguet <[email protected]>
---
drivers/net/ethernet/sfc/ef10_sriov.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10_sriov.c b/drivers/net/ethernet/sfc/ef10_sriov.c
index a5d28b0f75ba..84041cd587d7 100644
--- a/drivers/net/ethernet/sfc/ef10_sriov.c
+++ b/drivers/net/ethernet/sfc/ef10_sriov.c
@@ -402,12 +402,17 @@ static int efx_ef10_pci_sriov_enable(struct efx_nic *efx, int num_vfs)
return rc;
}

+/* Disable SRIOV and remove VFs
+ * If some VFs are attached to a guest (using Xen, only) nothing is
+ * done if force=false, and vports are freed if force=true (for the non
+ * attachedc ones, only) but SRIOV is not disabled and VFs are not
+ * removed in either case.
+ */
static int efx_ef10_pci_sriov_disable(struct efx_nic *efx, bool force)
{
struct pci_dev *dev = efx->pci_dev;
- unsigned int vfs_assigned = 0;
-
- vfs_assigned = pci_vfs_assigned(dev);
+ unsigned int vfs_assigned = pci_vfs_assigned(dev);
+ int rc = 0;

if (vfs_assigned && !force) {
netif_info(efx, drv, efx->net_dev, "VFs are assigned to guests; "
@@ -417,10 +422,12 @@ static int efx_ef10_pci_sriov_disable(struct efx_nic *efx, bool force)

if (!vfs_assigned)
pci_disable_sriov(dev);
+ else
+ rc = -EBUSY;

efx_ef10_sriov_free_vf_vswitching(efx);
efx->vf_count = 0;
- return 0;
+ return rc;
}

int efx_ef10_sriov_configure(struct efx_nic *efx, int num_vfs)
--
2.31.1

2021-06-21 15:34:57

by Íñigo Huguet

[permalink] [raw]
Subject: [PATCH 3/4] sfc: explain that "attached" VFs only refer to Xen

During SRIOV disabling it is checked wether any VF is currently attached
to a guest, using pci_vfs_assigned function. However, this check only
works with VFs attached with Xen, not with vfio/KVM. Added comments
clarifying this point.

Also, replaced manual check of PCI_DEV_FLAGS_ASSIGNED flag and used the
helper function pci_is_dev_assigned instead.

Signed-off-by: Íñigo Huguet <[email protected]>
---
drivers/net/ethernet/sfc/ef10.c | 3 ++-
drivers/net/ethernet/sfc/ef10_sriov.c | 7 ++++---
2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index c3f35da1b82a..bea961013f7c 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -1070,7 +1070,8 @@ static int efx_ef10_probe_vf(struct efx_nic *efx)

/* If the parent PF has no VF data structure, it doesn't know about this
* VF so fail probe. The VF needs to be re-created. This can happen
- * if the PF driver is unloaded while the VF is assigned to a guest.
+ * if the PF driver was unloaded while any VF was assigned to a guest
+ * (using Xen, only).
*/
pci_dev_pf = efx->pci_dev->physfn;
if (pci_dev_pf) {
diff --git a/drivers/net/ethernet/sfc/ef10_sriov.c b/drivers/net/ethernet/sfc/ef10_sriov.c
index 84041cd587d7..f8f8fbe51ef8 100644
--- a/drivers/net/ethernet/sfc/ef10_sriov.c
+++ b/drivers/net/ethernet/sfc/ef10_sriov.c
@@ -122,8 +122,7 @@ static void efx_ef10_sriov_free_vf_vports(struct efx_nic *efx)
struct ef10_vf *vf = nic_data->vf + i;

/* If VF is assigned, do not free the vport */
- if (vf->pci_dev &&
- vf->pci_dev->dev_flags & PCI_DEV_FLAGS_ASSIGNED)
+ if (vf->pci_dev && pci_is_dev_assigned(vf->pci_dev))
continue;

if (vf->vport_assigned) {
@@ -449,7 +448,9 @@ void efx_ef10_sriov_fini(struct efx_nic *efx)
int rc;

if (!nic_data->vf) {
- /* Remove any un-assigned orphaned VFs */
+ /* Remove any un-assigned orphaned VFs. This can happen if the PF driver
+ * was unloaded while any VF was assigned to a guest (using Xen, only).
+ */
if (pci_num_vf(efx->pci_dev) && !pci_vfs_assigned(efx->pci_dev))
pci_disable_sriov(efx->pci_dev);
return;
--
2.31.1

2021-06-21 15:37:42

by Íñigo Huguet

[permalink] [raw]
Subject: [PATCH 4/4] sfc: avoid duplicated code in ef10_sriov

The fail path of efx_ef10_sriov_alloc_vf_vswitching is identical to the
full content of efx_ef10_sriov_free_vf_vswitching, so replace it for a
single call to efx_ef10_sriov_free_vf_vswitching.

Signed-off-by: Íñigo Huguet <[email protected]>
---
drivers/net/ethernet/sfc/ef10_sriov.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10_sriov.c b/drivers/net/ethernet/sfc/ef10_sriov.c
index f8f8fbe51ef8..752d6406f07e 100644
--- a/drivers/net/ethernet/sfc/ef10_sriov.c
+++ b/drivers/net/ethernet/sfc/ef10_sriov.c
@@ -206,9 +206,7 @@ static int efx_ef10_sriov_alloc_vf_vswitching(struct efx_nic *efx)

return 0;
fail:
- efx_ef10_sriov_free_vf_vports(efx);
- kfree(nic_data->vf);
- nic_data->vf = NULL;
+ efx_ef10_sriov_free_vf_vswitching(efx);
return rc;
}

--
2.31.1