2023-12-20 23:14:31

by Jim Harris

Subject: [PATCH 0/2] pci/iov: avoid device_lock() when reading sriov_numvfs

If an SR-IOV enabled device is held by vfio and the device is removed, vfio
will hold the device lock and notify userspace of the removal. If userspace then
reads the sriov_numvfs sysfs entry, that thread will block, since
sriov_numvfs_show() also tries to acquire the device lock. If that same thread
is responsible for releasing the device to vfio, the result is a deadlock.

One patch was proposed to add a separate mutex, specifically for struct pci_sriov,
to synchronize access to sriov_numvfs in the sysfs paths (replacing use of the
device_lock()). Leon instead suggested just reverting commit 35ff867b7657, which
introduced device_lock() in the show path. This also led to a small fix around
ordering on the kobject_uevent() when sriov_numvfs is updated.

Ref: https://lore.kernel.org/linux-pci/ZXJI5+f8bUelVXqu@ubuntu/
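
The blocking read described above can be modeled in userspace. This is only a
sketch, not kernel code: struct dev_model, model_trylock() and the show_*()
helpers are hypothetical names, and a C11 atomic_flag stands in for
device_lock(). A try-lock is used so that the sketch reports where the real
sriov_numvfs_show() would block instead of actually hanging.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a device: one lock plus the VF count. */
struct dev_model {
	atomic_flag lock;	/* stand-in for device_lock() */
	int num_vfs;		/* stand-in for iov->num_VFs */
};

static void model_init(struct dev_model *d, int num_vfs)
{
	atomic_flag_clear(&d->lock);
	d->num_vfs = num_vfs;
}

/* Returns true if the lock was free and is now held by the caller. */
static bool model_trylock(struct dev_model *d)
{
	return !atomic_flag_test_and_set(&d->lock);
}

static void model_unlock(struct dev_model *d)
{
	atomic_flag_clear(&d->lock);
}

/*
 * Locked read, modeling the pre-revert show path. Returns false in the
 * situation where the real code would block waiting for the lock.
 */
static bool show_locked(struct dev_model *d, int *out)
{
	assert(out != NULL);
	if (!model_trylock(d))
		return false;
	*out = d->num_vfs;
	model_unlock(d);
	return true;
}

/* Lockless read, modeling the show path after the revert. */
static int show_lockless(const struct dev_model *d)
{
	return d->num_vfs;
}
```

If the removal path is modeled by calling model_trylock() first, a following
show_locked() fails while show_lockless() still returns the count. In the real
scenario the failing case is a thread that blocks and therefore never gets to
release the device.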

---

Jim Harris (2):
  Revert "PCI/IOV: Serialize sysfs sriov_numvfs reads vs writes"
  pci/iov: fix kobject_uevent() ordering in sriov_enable()


drivers/pci/iov.c | 10 ++--------
1 file changed, 2 insertions(+), 8 deletions(-)

--


2023-12-20 23:25:32

by Jim Harris

Subject: [PATCH 2/2] pci/iov: fix kobject_uevent() ordering in sriov_enable()

Wait to call kobject_uevent() until all of the associated changes are done,
including updating the num_VFs value.

Suggested-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jim Harris <[email protected]>
---
drivers/pci/iov.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index d4646bdcd887..7a0f33ef1826 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -677,8 +677,8 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 	if (rc)
 		goto err_pcibios;
 
-	kobject_uevent(&dev->dev.kobj, KOBJ_CHANGE);
 	iov->num_VFs = nr_virtfn;
+	kobject_uevent(&dev->dev.kobj, KOBJ_CHANGE);
 
 	return 0;
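
The effect of the reordering can be illustrated with a small userspace model.
This is a sketch under stated assumptions, not kernel code: struct sriov_model
and the functions below are hypothetical, and the listener runs synchronously
to show the worst case, whereas a real uevent consumer runs asynchronously and
may or may not observe the stale value.

```c
#include <assert.h>

/* Hypothetical stand-in for struct pci_sriov; only the relevant field. */
struct sriov_model {
	int num_vfs;
};

/*
 * Models a userspace listener that reads sriov_numvfs as soon as it is
 * notified. Returns the value it observed.
 */
static int listener_read(const struct sriov_model *iov)
{
	return iov->num_vfs;
}

/* Old ordering: notify first, then store the new count. */
static int enable_notify_first(struct sriov_model *iov, int nr_virtfn)
{
	int seen;

	assert(nr_virtfn >= 0);
	seen = listener_read(iov);	/* models kobject_uevent() */
	iov->num_vfs = nr_virtfn;
	return seen;
}

/* New ordering: store the new count, then notify. */
static int enable_update_first(struct sriov_model *iov, int nr_virtfn)
{
	assert(nr_virtfn >= 0);
	iov->num_vfs = nr_virtfn;
	return listener_read(iov);	/* models kobject_uevent() */
}
```

Starting from a count of 0 and enabling 4 VFs, the listener in
enable_notify_first() observes the old count, while the listener in
enable_update_first() observes 4.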


2023-12-25 11:20:25

by Leon Romanovsky

Subject: Re: [PATCH 2/2] pci/iov: fix kobject_uevent() ordering in sriov_enable()

On Wed, Dec 20, 2023 at 10:58:22PM +0000, Jim Harris wrote:
> Wait to call kobject_uevent() until all of the associated changes are done,
> including updating the num_VFs value.
>
> Suggested-by: Leon Romanovsky <[email protected]>
> Signed-off-by: Jim Harris <[email protected]>
> ---
> drivers/pci/iov.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>

Thanks,
Reviewed-by: Leon Romanovsky <[email protected]>

2024-02-09 00:30:11

by Bjorn Helgaas

Subject: Re: [PATCH 0/2] pci/iov: avoid device_lock() when reading sriov_numvfs

[+cc Pierre, author of 35ff867b7657 ("PCI/IOV: Serialize sysfs
sriov_numvfs reads vs writes")]

On Wed, Dec 20, 2023 at 10:58:12PM +0000, Jim Harris wrote:
> If SR-IOV enabled device is held by vfio, and device is removed,
> vfio will hold device lock and notify userspace of the removal. If
> userspace reads sriov_numvfs sysfs entry, that thread will be
> blocked since sriov_numvfs_show() also tries to acquire the device
> lock. If that same thread is responsible for releasing the device to
> vfio, it results in a deadlock.
>
> One patch was proposed to add a separate mutex, specifically for
> struct pci_sriov, to synchronize access to sriov_numvfs in the sysfs
> paths (replacing use of the device_lock()). Leon instead suggested
> just reverting the commit 35ff867b765 which introduced device_lock()
> in the store path. This also led to a small fix around ordering on
> the kobject_uevent() when sriov_numvfs is updated.
>
> Ref: https://lore.kernel.org/linux-pci/ZXJI5+f8bUelVXqu@ubuntu/

1) Cc author of the commit being reverted (Pierre) so he has a chance
to chime in and make sure the proposed fix works for him as well.

2) The revert commit log needs to justify the revert, not merely say
what the proper way is. The Ref: above suggests that the current code
(pre-revert) leads to a deadlock in some cases, so the revert commit
log should detail that.

It's ideal if we never regress, not even between the revert and the
second patch, so it's possible that they should be squashed into a
single patch. But if you keep it as two patches, it's trivial for me
to squash them if we decide that's best.

3) Follow subject line convention for drivers/pci (use "git log
--oneline drivers/pci" to learn it).

I did 1) here and could do 3) for you, but it would be better if you
could update and repost the series with 2) updated.

In the meantime you may notice that I pushed these to a
pci/virtualization branch just to get the 0-day bot to build-test them.
I propose to replace that branch with an updated series, since the code
changes themselves will probably stay the same.

> ---
>
> Jim Harris (2):
> Revert "PCI/IOV: Serialize sysfs sriov_numvfs reads vs writes"
> pci/iov: fix kobject_uevent() ordering in sriov_enable()
>
>
> drivers/pci/iov.c | 10 ++--------
> 1 file changed, 2 insertions(+), 8 deletions(-)
>
> --

2024-02-09 23:15:41

by Jim Harris

Subject: Re: [PATCH 0/2] pci/iov: avoid device_lock() when reading sriov_numvfs

On Thu, Feb 08, 2024 at 06:30:02PM -0600, Bjorn Helgaas wrote:
> [+cc Pierre, author of 35ff867b7657 ("PCI/IOV: Serialize sysfs
> sriov_numvfs reads vs writes")]
>
> On Wed, Dec 20, 2023 at 10:58:12PM +0000, Jim Harris wrote:
> > If SR-IOV enabled device is held by vfio, and device is removed,
> > vfio will hold device lock and notify userspace of the removal. If
> > userspace reads sriov_numvfs sysfs entry, that thread will be
> > blocked since sriov_numvfs_show() also tries to acquire the device
> > lock. If that same thread is responsible for releasing the device to
> > vfio, it results in a deadlock.
> >
> > One patch was proposed to add a separate mutex, specifically for
> > struct pci_sriov, to synchronize access to sriov_numvfs in the sysfs
> > paths (replacing use of the device_lock()). Leon instead suggested
> > just reverting the commit 35ff867b765 which introduced device_lock()
> > in the store path. This also led to a small fix around ordering on
> > the kobject_uevent() when sriov_numvfs is updated.
> >
> > Ref: https://lore.kernel.org/linux-pci/ZXJI5+f8bUelVXqu@ubuntu/
>
> 1) Cc author of the commit being reverted (Pierre) so he has a chance
> to chime in and make sure the proposed fix works for him as well.

Ack. I'll also Cc Pierre on the v2.

> 2) The revert commit log needs to justify the revert, not merely say
> what the proper way is. The Ref: above suggests that the current code
> (pre-revert) leads to a deadlock in some cases, so the revert commit
> log should detail that.
>
> It's ideal if we never regress, not even between the revert and the
> second patch, so it's possible that they should be squashed into a
> single patch. But if you keep it as two patches, it's trivial for me
> to squash them if we decide that's best.

The deadlock I hit is fixed by patch 1 alone. Patch 2 fixes a separate
bug: the num_VFs value should be updated before sending the notification
that it changed.

I'll add some more color to that commit message too, to differentiate it
from the revert. I have no issues if you eventually decide to squash them.
>
> 3) Follow subject line convention for drivers/pci (use "git log
> --oneline drivers/pci" to learn it).

Will fix in v2.

Thanks,

Jim