2022-09-09 03:06:04

by Tian, Kevin

[permalink] [raw]
Subject: [PATCH v3 00/15] Tidy up vfio_device life cycle

The idea is to let vfio core manage the vfio_device life cycle instead
of duplicating the logic cross drivers. Besides cleaner code in driver
side this also allows adding struct device to vfio_device as the first
step toward adding cdev uAPI in the future. Another benefit is that
user can now look at sysfs to decide whether a device is bound to
vfio [1], e.g.:

/sys/devices/pci0000\:6f/0000\:6f\:01.0/vfio-dev/vfio0

Though most drivers can fit the new model naturally:

- vfio_alloc_device() to allocate and initialize vfio_device
- vfio_put_device() to release vfio_device
- dev_ops->init() for driver private initialization
- dev_ops->release() for driver private cleanup

vfio-ccw is the only exception due to a life cycle mess that its private
structure mixes both parent and mdev info hence must be alloc/freed
outside of the life cycle of vfio device.

Per prior discussions this won't be fixed in short term by IBM folks [2].

Instead of waiting this series introduces a few tricks to move forward:

- vfio_init_device() to initialize a pre-allocated device structure;

- require *EVERY* driver to implement @release and free vfio_device
inside. Then vfio-ccw can use a completion mechanism to delay the
free to css driver;

The second trick is not a real burden to other drivers because they
all require a @release for private cleanup anyay. Later once the ccw
mess is fixed a simple cleanup can be done by moving free from @release
to vfio core.

Thanks
Kevin

[1] https://listman.redhat.com/archives/libvir-list/2022-August/233482.html
[2] https://lore.kernel.org/all/[email protected]/

v3:
- rebase to vfio-next after resolving conflicts with Yishai's series
- add missing fixes for two checkpatch errors
- fix grammar issues (Eric Auger)
- add more r-b's

v2:
- https://lore.kernel.org/lkml/[email protected]/
- rebase to 6.0-rc3
- fix build warnings (lkp)
- patch1: remove unnecessary forward reference (Jason)
- patch10: leave device_set released by vfio core (Jason)
- patch13: add Suggested-by
- patch15: add ABI file sysfs-devices-vfio-dev (Alex)
- patch15: rename 'vfio' to 'vfio_group' in procfs (Jason)

v1: https://lore.kernel.org/lkml/[email protected]/

Kevin Tian (6):
vfio: Add helpers for unifying vfio_device life cycle
drm/i915/gvt: Use the new device life cycle helpers
vfio/platform: Use the new device life cycle helpers
vfio/amba: Use the new device life cycle helpers
vfio/ccw: Use the new device life cycle helpers
vfio: Rename vfio_device_put() and vfio_device_try_get()

Yi Liu (9):
vfio/pci: Use the new device life cycle helpers
vfio/mlx5: Use the new device life cycle helpers
vfio/hisi_acc: Use the new device life cycle helpers
vfio/mdpy: Use the new device life cycle helpers
vfio/mtty: Use the new device life cycle helpers
vfio/mbochs: Use the new device life cycle helpers
vfio/ap: Use the new device life cycle helpers
vfio/fsl-mc: Use the new device life cycle helpers
vfio: Add struct device to vfio_device

.../ABI/testing/sysfs-devices-vfio-dev | 8 +
MAINTAINERS | 1 +
drivers/gpu/drm/i915/gvt/gvt.h | 5 +-
drivers/gpu/drm/i915/gvt/kvmgt.c | 52 ++++--
drivers/gpu/drm/i915/gvt/vgpu.c | 33 ++--
drivers/s390/cio/vfio_ccw_ops.c | 52 +++++-
drivers/s390/cio/vfio_ccw_private.h | 3 +
drivers/s390/crypto/vfio_ap_ops.c | 50 +++---
drivers/vfio/fsl-mc/vfio_fsl_mc.c | 85 +++++----
.../vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 80 ++++-----
drivers/vfio/pci/mlx5/main.c | 50 ++++--
drivers/vfio/pci/vfio_pci.c | 20 +--
drivers/vfio/pci/vfio_pci_core.c | 23 ++-
drivers/vfio/platform/vfio_amba.c | 72 ++++++--
drivers/vfio/platform/vfio_platform.c | 66 +++++--
drivers/vfio/platform/vfio_platform_common.c | 71 +++-----
drivers/vfio/platform/vfio_platform_private.h | 18 +-
drivers/vfio/vfio_main.c | 167 +++++++++++++++---
include/linux/vfio.h | 28 ++-
include/linux/vfio_pci_core.h | 6 +-
samples/vfio-mdev/mbochs.c | 73 +++++---
samples/vfio-mdev/mdpy.c | 81 +++++----
samples/vfio-mdev/mtty.c | 67 ++++---
23 files changed, 730 insertions(+), 381 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-devices-vfio-dev


base-commit: f39856aacb078c1c93acef011a37121b17d54fe0
--
2.21.3


2022-09-09 03:08:12

by Tian, Kevin

[permalink] [raw]
Subject: [PATCH v3 10/15] vfio/fsl-mc: Use the new device life cycle helpers

From: Yi Liu <[email protected]>

Also add a comment to mark that vfio core releases device_set if @init
fails.

Signed-off-by: Yi Liu <[email protected]>
Signed-off-by: Kevin Tian <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
---
drivers/vfio/fsl-mc/vfio_fsl_mc.c | 85 ++++++++++++++++++-------------
1 file changed, 49 insertions(+), 36 deletions(-)

diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
index 42b344bd7cd5..b16874e913e4 100644
--- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c
+++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
@@ -418,16 +418,7 @@ static int vfio_fsl_mc_mmap(struct vfio_device *core_vdev,
return vfio_fsl_mc_mmap_mmio(vdev->regions[index], vma);
}

-static const struct vfio_device_ops vfio_fsl_mc_ops = {
- .name = "vfio-fsl-mc",
- .open_device = vfio_fsl_mc_open_device,
- .close_device = vfio_fsl_mc_close_device,
- .ioctl = vfio_fsl_mc_ioctl,
- .read = vfio_fsl_mc_read,
- .write = vfio_fsl_mc_write,
- .mmap = vfio_fsl_mc_mmap,
-};
-
+static const struct vfio_device_ops vfio_fsl_mc_ops;
static int vfio_fsl_mc_bus_notifier(struct notifier_block *nb,
unsigned long action, void *data)
{
@@ -518,35 +509,43 @@ static void vfio_fsl_uninit_device(struct vfio_fsl_mc_device *vdev)
bus_unregister_notifier(&fsl_mc_bus_type, &vdev->nb);
}

-static int vfio_fsl_mc_probe(struct fsl_mc_device *mc_dev)
+static int vfio_fsl_mc_init_dev(struct vfio_device *core_vdev)
{
- struct vfio_fsl_mc_device *vdev;
- struct device *dev = &mc_dev->dev;
+ struct vfio_fsl_mc_device *vdev =
+ container_of(core_vdev, struct vfio_fsl_mc_device, vdev);
+ struct fsl_mc_device *mc_dev = to_fsl_mc_device(core_vdev->dev);
int ret;

- vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
- if (!vdev)
- return -ENOMEM;
-
- vfio_init_group_dev(&vdev->vdev, dev, &vfio_fsl_mc_ops);
vdev->mc_dev = mc_dev;
mutex_init(&vdev->igate);

if (is_fsl_mc_bus_dprc(mc_dev))
- ret = vfio_assign_device_set(&vdev->vdev, &mc_dev->dev);
+ ret = vfio_assign_device_set(core_vdev, &mc_dev->dev);
else
- ret = vfio_assign_device_set(&vdev->vdev, mc_dev->dev.parent);
- if (ret)
- goto out_uninit;
+ ret = vfio_assign_device_set(core_vdev, mc_dev->dev.parent);

- ret = vfio_fsl_mc_init_device(vdev);
if (ret)
- goto out_uninit;
+ return ret;
+
+ /* device_set is released by vfio core if @init fails */
+ return vfio_fsl_mc_init_device(vdev);
+}
+
+static int vfio_fsl_mc_probe(struct fsl_mc_device *mc_dev)
+{
+ struct vfio_fsl_mc_device *vdev;
+ struct device *dev = &mc_dev->dev;
+ int ret;
+
+ vdev = vfio_alloc_device(vfio_fsl_mc_device, vdev, dev,
+ &vfio_fsl_mc_ops);
+ if (IS_ERR(vdev))
+ return PTR_ERR(vdev);

ret = vfio_register_group_dev(&vdev->vdev);
if (ret) {
dev_err(dev, "VFIO_FSL_MC: Failed to add to vfio group\n");
- goto out_device;
+ goto out_put_vdev;
}

ret = vfio_fsl_mc_scan_container(mc_dev);
@@ -557,30 +556,44 @@ static int vfio_fsl_mc_probe(struct fsl_mc_device *mc_dev)

out_group_dev:
vfio_unregister_group_dev(&vdev->vdev);
-out_device:
- vfio_fsl_uninit_device(vdev);
-out_uninit:
- vfio_uninit_group_dev(&vdev->vdev);
- kfree(vdev);
+out_put_vdev:
+ vfio_put_device(&vdev->vdev);
return ret;
}

+static void vfio_fsl_mc_release_dev(struct vfio_device *core_vdev)
+{
+ struct vfio_fsl_mc_device *vdev =
+ container_of(core_vdev, struct vfio_fsl_mc_device, vdev);
+
+ vfio_fsl_uninit_device(vdev);
+ mutex_destroy(&vdev->igate);
+ vfio_free_device(core_vdev);
+}
+
static int vfio_fsl_mc_remove(struct fsl_mc_device *mc_dev)
{
struct device *dev = &mc_dev->dev;
struct vfio_fsl_mc_device *vdev = dev_get_drvdata(dev);

vfio_unregister_group_dev(&vdev->vdev);
- mutex_destroy(&vdev->igate);
-
dprc_remove_devices(mc_dev, NULL, 0);
- vfio_fsl_uninit_device(vdev);
-
- vfio_uninit_group_dev(&vdev->vdev);
- kfree(vdev);
+ vfio_put_device(&vdev->vdev);
return 0;
}

+static const struct vfio_device_ops vfio_fsl_mc_ops = {
+ .name = "vfio-fsl-mc",
+ .init = vfio_fsl_mc_init_dev,
+ .release = vfio_fsl_mc_release_dev,
+ .open_device = vfio_fsl_mc_open_device,
+ .close_device = vfio_fsl_mc_close_device,
+ .ioctl = vfio_fsl_mc_ioctl,
+ .read = vfio_fsl_mc_read,
+ .write = vfio_fsl_mc_write,
+ .mmap = vfio_fsl_mc_mmap,
+};
+
static struct fsl_mc_driver vfio_fsl_mc_driver = {
.probe = vfio_fsl_mc_probe,
.remove = vfio_fsl_mc_remove,
--
2.21.3

2022-09-09 03:16:20

by Tian, Kevin

[permalink] [raw]
Subject: [PATCH v3 09/15] vfio/ap: Use the new device life cycle helpers

From: Yi Liu <[email protected]>

and manage available_instances inside @init/@release.

Signed-off-by: Yi Liu <[email protected]>
Signed-off-by: Kevin Tian <[email protected]>
Reviewed-by: Tony Krowiak <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
---
drivers/s390/crypto/vfio_ap_ops.c | 50 ++++++++++++++++++-------------
1 file changed, 29 insertions(+), 21 deletions(-)

diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 6c8c41fac4e1..803aadfd0876 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -684,42 +684,44 @@ static bool vfio_ap_mdev_filter_matrix(unsigned long *apm, unsigned long *aqm,
AP_DOMAINS);
}

-static int vfio_ap_mdev_probe(struct mdev_device *mdev)
+static int vfio_ap_mdev_init_dev(struct vfio_device *vdev)
{
- struct ap_matrix_mdev *matrix_mdev;
- int ret;
+ struct ap_matrix_mdev *matrix_mdev =
+ container_of(vdev, struct ap_matrix_mdev, vdev);

if ((atomic_dec_if_positive(&matrix_dev->available_instances) < 0))
return -EPERM;

- matrix_mdev = kzalloc(sizeof(*matrix_mdev), GFP_KERNEL);
- if (!matrix_mdev) {
- ret = -ENOMEM;
- goto err_dec_available;
- }
- vfio_init_group_dev(&matrix_mdev->vdev, &mdev->dev,
- &vfio_ap_matrix_dev_ops);
-
- matrix_mdev->mdev = mdev;
+ matrix_mdev->mdev = to_mdev_device(vdev->dev);
vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->matrix);
matrix_mdev->pqap_hook = handle_pqap;
vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->shadow_apcb);
hash_init(matrix_mdev->qtable.queues);

+ return 0;
+}
+
+static int vfio_ap_mdev_probe(struct mdev_device *mdev)
+{
+ struct ap_matrix_mdev *matrix_mdev;
+ int ret;
+
+ matrix_mdev = vfio_alloc_device(ap_matrix_mdev, vdev, &mdev->dev,
+ &vfio_ap_matrix_dev_ops);
+ if (IS_ERR(matrix_mdev))
+ return PTR_ERR(matrix_mdev);
+
ret = vfio_register_emulated_iommu_dev(&matrix_mdev->vdev);
if (ret)
- goto err_list;
+ goto err_put_vdev;
dev_set_drvdata(&mdev->dev, matrix_mdev);
mutex_lock(&matrix_dev->mdevs_lock);
list_add(&matrix_mdev->node, &matrix_dev->mdev_list);
mutex_unlock(&matrix_dev->mdevs_lock);
return 0;

-err_list:
- vfio_uninit_group_dev(&matrix_mdev->vdev);
- kfree(matrix_mdev);
-err_dec_available:
- atomic_inc(&matrix_dev->available_instances);
+err_put_vdev:
+ vfio_put_device(&matrix_mdev->vdev);
return ret;
}

@@ -766,6 +768,12 @@ static void vfio_ap_mdev_unlink_fr_queues(struct ap_matrix_mdev *matrix_mdev)
}
}

+static void vfio_ap_mdev_release_dev(struct vfio_device *vdev)
+{
+ vfio_free_device(vdev);
+ atomic_inc(&matrix_dev->available_instances);
+}
+
static void vfio_ap_mdev_remove(struct mdev_device *mdev)
{
struct ap_matrix_mdev *matrix_mdev = dev_get_drvdata(&mdev->dev);
@@ -779,9 +787,7 @@ static void vfio_ap_mdev_remove(struct mdev_device *mdev)
list_del(&matrix_mdev->node);
mutex_unlock(&matrix_dev->mdevs_lock);
mutex_unlock(&matrix_dev->guests_lock);
- vfio_uninit_group_dev(&matrix_mdev->vdev);
- kfree(matrix_mdev);
- atomic_inc(&matrix_dev->available_instances);
+ vfio_put_device(&matrix_mdev->vdev);
}

static ssize_t name_show(struct mdev_type *mtype,
@@ -1794,6 +1800,8 @@ static const struct attribute_group vfio_queue_attr_group = {
};

static const struct vfio_device_ops vfio_ap_matrix_dev_ops = {
+ .init = vfio_ap_mdev_init_dev,
+ .release = vfio_ap_mdev_release_dev,
.open_device = vfio_ap_mdev_open_device,
.close_device = vfio_ap_mdev_close_device,
.ioctl = vfio_ap_mdev_ioctl,
--
2.21.3

2022-09-09 03:17:29

by Tian, Kevin

[permalink] [raw]
Subject: [PATCH v3 02/15] vfio/pci: Use the new device life cycle helpers

From: Yi Liu <[email protected]>

Also introduce two pci core helpers as @init/@release for pci drivers:

- vfio_pci_core_init_dev()
- vfio_pci_core_release_dev()

Signed-off-by: Yi Liu <[email protected]>
Signed-off-by: Kevin Tian <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
---
drivers/vfio/pci/vfio_pci.c | 20 +++++++++---------
drivers/vfio/pci/vfio_pci_core.c | 35 ++++++++++++++++++++++++++++++++
include/linux/vfio_pci_core.h | 2 ++
3 files changed, 47 insertions(+), 10 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index d9b5c03f8d5b..1d4919edfbde 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -127,6 +127,8 @@ static int vfio_pci_open_device(struct vfio_device *core_vdev)

static const struct vfio_device_ops vfio_pci_ops = {
.name = "vfio-pci",
+ .init = vfio_pci_core_init_dev,
+ .release = vfio_pci_core_release_dev,
.open_device = vfio_pci_open_device,
.close_device = vfio_pci_core_close_device,
.ioctl = vfio_pci_core_ioctl,
@@ -146,20 +148,19 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
if (vfio_pci_is_denylisted(pdev))
return -EINVAL;

- vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
- if (!vdev)
- return -ENOMEM;
- vfio_pci_core_init_device(vdev, pdev, &vfio_pci_ops);
+ vdev = vfio_alloc_device(vfio_pci_core_device, vdev, &pdev->dev,
+ &vfio_pci_ops);
+ if (IS_ERR(vdev))
+ return PTR_ERR(vdev);

dev_set_drvdata(&pdev->dev, vdev);
ret = vfio_pci_core_register_device(vdev);
if (ret)
- goto out_free;
+ goto out_put_vdev;
return 0;

-out_free:
- vfio_pci_core_uninit_device(vdev);
- kfree(vdev);
+out_put_vdev:
+ vfio_put_device(&vdev->vdev);
return ret;
}

@@ -168,8 +169,7 @@ static void vfio_pci_remove(struct pci_dev *pdev)
struct vfio_pci_core_device *vdev = dev_get_drvdata(&pdev->dev);

vfio_pci_core_unregister_device(vdev);
- vfio_pci_core_uninit_device(vdev);
- kfree(vdev);
+ vfio_put_device(&vdev->vdev);
}

static int vfio_pci_sriov_configure(struct pci_dev *pdev, int nr_virtfn)
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 0a801aee2f2d..77d33739c6e8 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -2078,6 +2078,41 @@ static void vfio_pci_vga_uninit(struct vfio_pci_core_device *vdev)
VGA_RSRC_LEGACY_MEM);
}

+int vfio_pci_core_init_dev(struct vfio_device *core_vdev)
+{
+ struct vfio_pci_core_device *vdev =
+ container_of(core_vdev, struct vfio_pci_core_device, vdev);
+
+ vdev->pdev = to_pci_dev(core_vdev->dev);
+ vdev->irq_type = VFIO_PCI_NUM_IRQS;
+ mutex_init(&vdev->igate);
+ spin_lock_init(&vdev->irqlock);
+ mutex_init(&vdev->ioeventfds_lock);
+ INIT_LIST_HEAD(&vdev->dummy_resources_list);
+ INIT_LIST_HEAD(&vdev->ioeventfds_list);
+ mutex_init(&vdev->vma_lock);
+ INIT_LIST_HEAD(&vdev->vma_list);
+ INIT_LIST_HEAD(&vdev->sriov_pfs_item);
+ init_rwsem(&vdev->memory_lock);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(vfio_pci_core_init_dev);
+
+void vfio_pci_core_release_dev(struct vfio_device *core_vdev)
+{
+ struct vfio_pci_core_device *vdev =
+ container_of(core_vdev, struct vfio_pci_core_device, vdev);
+
+ mutex_destroy(&vdev->igate);
+ mutex_destroy(&vdev->ioeventfds_lock);
+ mutex_destroy(&vdev->vma_lock);
+ kfree(vdev->region);
+ kfree(vdev->pm_save);
+ vfio_free_device(core_vdev);
+}
+EXPORT_SYMBOL_GPL(vfio_pci_core_release_dev);
+
void vfio_pci_core_init_device(struct vfio_pci_core_device *vdev,
struct pci_dev *pdev,
const struct vfio_device_ops *vfio_pci_ops)
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 089b603bcfdc..0499ea836058 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -109,6 +109,8 @@ void vfio_pci_core_close_device(struct vfio_device *core_vdev);
void vfio_pci_core_init_device(struct vfio_pci_core_device *vdev,
struct pci_dev *pdev,
const struct vfio_device_ops *vfio_pci_ops);
+int vfio_pci_core_init_dev(struct vfio_device *core_vdev);
+void vfio_pci_core_release_dev(struct vfio_device *core_vdev);
int vfio_pci_core_register_device(struct vfio_pci_core_device *vdev);
void vfio_pci_core_uninit_device(struct vfio_pci_core_device *vdev);
void vfio_pci_core_unregister_device(struct vfio_pci_core_device *vdev);
--
2.21.3

2022-09-09 03:19:34

by Tian, Kevin

[permalink] [raw]
Subject: [PATCH v3 14/15] vfio: Rename vfio_device_put() and vfio_device_try_get()

With the addition of vfio_put_device() now the names become confusing.

vfio_put_device() is clear from object life cycle p.o.v given kref.

vfio_device_put()/vfio_device_try_get() are helpers for tracking
users on a registered device.

Now rename them:

- vfio_device_put() -> vfio_device_put_registration()
- vfio_device_try_get() -> vfio_device_try_get_registration()

Signed-off-by: Kevin Tian <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
---
drivers/vfio/vfio_main.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 5bc221730d7b..37cbd10f3faf 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -453,13 +453,13 @@ static void vfio_group_get(struct vfio_group *group)
* Device objects - create, release, get, put, search
*/
/* Device reference always implies a group reference */
-static void vfio_device_put(struct vfio_device *device)
+static void vfio_device_put_registration(struct vfio_device *device)
{
if (refcount_dec_and_test(&device->refcount))
complete(&device->comp);
}

-static bool vfio_device_try_get(struct vfio_device *device)
+static bool vfio_device_try_get_registration(struct vfio_device *device)
{
return refcount_inc_not_zero(&device->refcount);
}
@@ -471,7 +471,8 @@ static struct vfio_device *vfio_group_get_device(struct vfio_group *group,

mutex_lock(&group->device_lock);
list_for_each_entry(device, &group->device_list, group_next) {
- if (device->dev == dev && vfio_device_try_get(device)) {
+ if (device->dev == dev &&
+ vfio_device_try_get_registration(device)) {
mutex_unlock(&group->device_lock);
return device;
}
@@ -673,7 +674,7 @@ static int __vfio_register_dev(struct vfio_device *device,
if (existing_device) {
dev_WARN(device->dev, "Device already exists on group %d\n",
iommu_group_id(group->iommu_group));
- vfio_device_put(existing_device);
+ vfio_device_put_registration(existing_device);
if (group->type == VFIO_NO_IOMMU ||
group->type == VFIO_EMULATED_IOMMU)
iommu_group_remove_device(device->dev);
@@ -731,7 +732,7 @@ static struct vfio_device *vfio_device_get_from_name(struct vfio_group *group,
ret = !strcmp(dev_name(it->dev), buf);
}

- if (ret && vfio_device_try_get(it)) {
+ if (ret && vfio_device_try_get_registration(it)) {
device = it;
break;
}
@@ -751,7 +752,7 @@ void vfio_unregister_group_dev(struct vfio_device *device)
bool interrupted = false;
long rc;

- vfio_device_put(device);
+ vfio_device_put_registration(device);
rc = try_wait_for_completion(&device->comp);
while (rc <= 0) {
if (device->ops->request)
@@ -1311,7 +1312,7 @@ static int vfio_group_ioctl_get_device_fd(struct vfio_group *group,
err_put_fdno:
put_unused_fd(fdno);
err_put_device:
- vfio_device_put(device);
+ vfio_device_put_registration(device);
return ret;
}

@@ -1493,7 +1494,7 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep)

vfio_device_unassign_container(device);

- vfio_device_put(device);
+ vfio_device_put_registration(device);

return 0;
}
--
2.21.3

2022-09-09 03:24:18

by Tian, Kevin

[permalink] [raw]
Subject: [PATCH v3 08/15] drm/i915/gvt: Use the new device life cycle helpers

Move vfio_device to the start of intel_vgpu as required by the new
helpers.

Change intel_gvt_create_vgpu() to use intel_vgpu as the first param
as other vgpu helpers do.

Signed-off-by: Kevin Tian <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Reviewed-by: Zhenyu Wang <[email protected]>
---
drivers/gpu/drm/i915/gvt/gvt.h | 5 ++-
drivers/gpu/drm/i915/gvt/kvmgt.c | 52 ++++++++++++++++++++++----------
drivers/gpu/drm/i915/gvt/vgpu.c | 33 ++++++++------------
3 files changed, 50 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/gvt.h b/drivers/gpu/drm/i915/gvt/gvt.h
index 705689e64011..89fab7896fc6 100644
--- a/drivers/gpu/drm/i915/gvt/gvt.h
+++ b/drivers/gpu/drm/i915/gvt/gvt.h
@@ -172,6 +172,7 @@ struct intel_vgpu_submission {
#define KVMGT_DEBUGFS_FILENAME "kvmgt_nr_cache_entries"

struct intel_vgpu {
+ struct vfio_device vfio_device;
struct intel_gvt *gvt;
struct mutex vgpu_lock;
int id;
@@ -211,7 +212,6 @@ struct intel_vgpu {

u32 scan_nonprivbb;

- struct vfio_device vfio_device;
struct vfio_region *region;
int num_regions;
struct eventfd_ctx *intx_trigger;
@@ -494,8 +494,7 @@ void intel_gvt_clean_vgpu_types(struct intel_gvt *gvt);

struct intel_vgpu *intel_gvt_create_idle_vgpu(struct intel_gvt *gvt);
void intel_gvt_destroy_idle_vgpu(struct intel_vgpu *vgpu);
-struct intel_vgpu *intel_gvt_create_vgpu(struct intel_gvt *gvt,
- struct intel_vgpu_type *type);
+int intel_gvt_create_vgpu(struct intel_vgpu *vgpu, struct intel_vgpu_type *type);
void intel_gvt_destroy_vgpu(struct intel_vgpu *vgpu);
void intel_gvt_release_vgpu(struct intel_vgpu *vgpu);
void intel_gvt_reset_vgpu_locked(struct intel_vgpu *vgpu, bool dmlr,
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index e3cd58946477..41bba40feef8 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -1546,7 +1546,33 @@ static const struct attribute_group *intel_vgpu_groups[] = {
NULL,
};

+static int intel_vgpu_init_dev(struct vfio_device *vfio_dev)
+{
+ struct mdev_device *mdev = to_mdev_device(vfio_dev->dev);
+ struct device *pdev = mdev_parent_dev(mdev);
+ struct intel_gvt *gvt = kdev_to_i915(pdev)->gvt;
+ struct intel_vgpu_type *type;
+ struct intel_vgpu *vgpu = vfio_dev_to_vgpu(vfio_dev);
+
+ type = &gvt->types[mdev_get_type_group_id(mdev)];
+ if (!type)
+ return -EINVAL;
+
+ vgpu->gvt = gvt;
+ return intel_gvt_create_vgpu(vgpu, type);
+}
+
+static void intel_vgpu_release_dev(struct vfio_device *vfio_dev)
+{
+ struct intel_vgpu *vgpu = vfio_dev_to_vgpu(vfio_dev);
+
+ intel_gvt_destroy_vgpu(vgpu);
+ vfio_free_device(vfio_dev);
+}
+
static const struct vfio_device_ops intel_vgpu_dev_ops = {
+ .init = intel_vgpu_init_dev,
+ .release = intel_vgpu_release_dev,
.open_device = intel_vgpu_open_device,
.close_device = intel_vgpu_close_device,
.read = intel_vgpu_read,
@@ -1558,35 +1584,28 @@ static const struct vfio_device_ops intel_vgpu_dev_ops = {

static int intel_vgpu_probe(struct mdev_device *mdev)
{
- struct device *pdev = mdev_parent_dev(mdev);
- struct intel_gvt *gvt = kdev_to_i915(pdev)->gvt;
- struct intel_vgpu_type *type;
struct intel_vgpu *vgpu;
int ret;

- type = &gvt->types[mdev_get_type_group_id(mdev)];
- if (!type)
- return -EINVAL;
-
- vgpu = intel_gvt_create_vgpu(gvt, type);
+ vgpu = vfio_alloc_device(intel_vgpu, vfio_device, &mdev->dev,
+ &intel_vgpu_dev_ops);
if (IS_ERR(vgpu)) {
gvt_err("failed to create intel vgpu: %ld\n", PTR_ERR(vgpu));
return PTR_ERR(vgpu);
}

- vfio_init_group_dev(&vgpu->vfio_device, &mdev->dev,
- &intel_vgpu_dev_ops);
-
dev_set_drvdata(&mdev->dev, vgpu);
ret = vfio_register_emulated_iommu_dev(&vgpu->vfio_device);
- if (ret) {
- intel_gvt_destroy_vgpu(vgpu);
- return ret;
- }
+ if (ret)
+ goto out_put_vdev;

gvt_dbg_core("intel_vgpu_create succeeded for mdev: %s\n",
dev_name(mdev_dev(mdev)));
return 0;
+
+out_put_vdev:
+ vfio_put_device(&vgpu->vfio_device);
+ return ret;
}

static void intel_vgpu_remove(struct mdev_device *mdev)
@@ -1595,7 +1614,8 @@ static void intel_vgpu_remove(struct mdev_device *mdev)

if (WARN_ON_ONCE(vgpu->attached))
return;
- intel_gvt_destroy_vgpu(vgpu);
+
+ vfio_put_device(&vgpu->vfio_device);
}

static struct mdev_driver intel_vgpu_mdev_driver = {
diff --git a/drivers/gpu/drm/i915/gvt/vgpu.c b/drivers/gpu/drm/i915/gvt/vgpu.c
index 46da19b3225d..5c533fbc2c8d 100644
--- a/drivers/gpu/drm/i915/gvt/vgpu.c
+++ b/drivers/gpu/drm/i915/gvt/vgpu.c
@@ -302,8 +302,6 @@ void intel_gvt_destroy_vgpu(struct intel_vgpu *vgpu)
mutex_lock(&gvt->lock);
intel_gvt_update_vgpu_types(gvt);
mutex_unlock(&gvt->lock);
-
- vfree(vgpu);
}

#define IDLE_VGPU_IDR 0
@@ -363,28 +361,23 @@ void intel_gvt_destroy_idle_vgpu(struct intel_vgpu *vgpu)
vfree(vgpu);
}

-static struct intel_vgpu *__intel_gvt_create_vgpu(struct intel_gvt *gvt,
- struct intel_vgpu_creation_params *param)
+static int __intel_gvt_create_vgpu(struct intel_vgpu *vgpu,
+ struct intel_vgpu_creation_params *param)
{
+ struct intel_gvt *gvt = vgpu->gvt;
struct drm_i915_private *dev_priv = gvt->gt->i915;
- struct intel_vgpu *vgpu;
int ret;

gvt_dbg_core("low %llu MB high %llu MB fence %llu\n",
param->low_gm_sz, param->high_gm_sz,
param->fence_sz);

- vgpu = vzalloc(sizeof(*vgpu));
- if (!vgpu)
- return ERR_PTR(-ENOMEM);
-
ret = idr_alloc(&gvt->vgpu_idr, vgpu, IDLE_VGPU_IDR + 1, GVT_MAX_VGPU,
GFP_KERNEL);
if (ret < 0)
- goto out_free_vgpu;
+ return ret;

vgpu->id = ret;
- vgpu->gvt = gvt;
vgpu->sched_ctl.weight = param->weight;
mutex_init(&vgpu->vgpu_lock);
mutex_init(&vgpu->dmabuf_lock);
@@ -437,7 +430,7 @@ static struct intel_vgpu *__intel_gvt_create_vgpu(struct intel_gvt *gvt,
if (ret)
goto out_clean_sched_policy;

- return vgpu;
+ return 0;

out_clean_sched_policy:
intel_vgpu_clean_sched_policy(vgpu);
@@ -455,9 +448,7 @@ static struct intel_vgpu *__intel_gvt_create_vgpu(struct intel_gvt *gvt,
intel_vgpu_clean_mmio(vgpu);
out_clean_idr:
idr_remove(&gvt->vgpu_idr, vgpu->id);
-out_free_vgpu:
- vfree(vgpu);
- return ERR_PTR(ret);
+ return ret;
}

/**
@@ -470,11 +461,11 @@ static struct intel_vgpu *__intel_gvt_create_vgpu(struct intel_gvt *gvt,
* Returns:
* pointer to intel_vgpu, error pointer if failed.
*/
-struct intel_vgpu *intel_gvt_create_vgpu(struct intel_gvt *gvt,
- struct intel_vgpu_type *type)
+int intel_gvt_create_vgpu(struct intel_vgpu *vgpu, struct intel_vgpu_type *type)
{
+ struct intel_gvt *gvt = vgpu->gvt;
struct intel_vgpu_creation_params param;
- struct intel_vgpu *vgpu;
+ int ret;

param.primary = 1;
param.low_gm_sz = type->low_gm_size;
@@ -488,15 +479,15 @@ struct intel_vgpu *intel_gvt_create_vgpu(struct intel_gvt *gvt,
param.high_gm_sz = BYTES_TO_MB(param.high_gm_sz);

mutex_lock(&gvt->lock);
- vgpu = __intel_gvt_create_vgpu(gvt, &param);
- if (!IS_ERR(vgpu)) {
+ ret = __intel_gvt_create_vgpu(vgpu, &param);
+ if (!ret) {
/* calculate left instance change for types */
intel_gvt_update_vgpu_types(gvt);
intel_gvt_update_reg_whitelist(vgpu);
}
mutex_unlock(&gvt->lock);

- return vgpu;
+ return ret;
}

/**
--
2.21.3

2022-09-09 03:49:27

by Tian, Kevin

[permalink] [raw]
Subject: [PATCH v3 03/15] vfio/mlx5: Use the new device life cycle helpers

From: Yi Liu <[email protected]>

mlx5 has its own @init/@release for handling migration cap.

Signed-off-by: Yi Liu <[email protected]>
Signed-off-by: Kevin Tian <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
---
drivers/vfio/pci/mlx5/main.c | 50 ++++++++++++++++++++++++++----------
1 file changed, 36 insertions(+), 14 deletions(-)

diff --git a/drivers/vfio/pci/mlx5/main.c b/drivers/vfio/pci/mlx5/main.c
index 759a5f5f7b3f..fd6ccb8454a2 100644
--- a/drivers/vfio/pci/mlx5/main.c
+++ b/drivers/vfio/pci/mlx5/main.c
@@ -585,8 +585,35 @@ static const struct vfio_log_ops mlx5vf_pci_log_ops = {
.log_read_and_clear = mlx5vf_tracker_read_and_clear,
};

+static int mlx5vf_pci_init_dev(struct vfio_device *core_vdev)
+{
+ struct mlx5vf_pci_core_device *mvdev = container_of(core_vdev,
+ struct mlx5vf_pci_core_device, core_device.vdev);
+ int ret;
+
+ ret = vfio_pci_core_init_dev(core_vdev);
+ if (ret)
+ return ret;
+
+ mlx5vf_cmd_set_migratable(mvdev, &mlx5vf_pci_mig_ops,
+ &mlx5vf_pci_log_ops);
+
+ return 0;
+}
+
+static void mlx5vf_pci_release_dev(struct vfio_device *core_vdev)
+{
+ struct mlx5vf_pci_core_device *mvdev = container_of(core_vdev,
+ struct mlx5vf_pci_core_device, core_device.vdev);
+
+ mlx5vf_cmd_remove_migratable(mvdev);
+ vfio_pci_core_release_dev(core_vdev);
+}
+
static const struct vfio_device_ops mlx5vf_pci_ops = {
.name = "mlx5-vfio-pci",
+ .init = mlx5vf_pci_init_dev,
+ .release = mlx5vf_pci_release_dev,
.open_device = mlx5vf_pci_open_device,
.close_device = mlx5vf_pci_close_device,
.ioctl = vfio_pci_core_ioctl,
@@ -604,22 +631,19 @@ static int mlx5vf_pci_probe(struct pci_dev *pdev,
struct mlx5vf_pci_core_device *mvdev;
int ret;

- mvdev = kzalloc(sizeof(*mvdev), GFP_KERNEL);
- if (!mvdev)
- return -ENOMEM;
- vfio_pci_core_init_device(&mvdev->core_device, pdev, &mlx5vf_pci_ops);
- mlx5vf_cmd_set_migratable(mvdev, &mlx5vf_pci_mig_ops,
- &mlx5vf_pci_log_ops);
+ mvdev = vfio_alloc_device(mlx5vf_pci_core_device, core_device.vdev,
+ &pdev->dev, &mlx5vf_pci_ops);
+ if (IS_ERR(mvdev))
+ return PTR_ERR(mvdev);
+
dev_set_drvdata(&pdev->dev, &mvdev->core_device);
ret = vfio_pci_core_register_device(&mvdev->core_device);
if (ret)
- goto out_free;
+ goto out_put_vdev;
return 0;

-out_free:
- mlx5vf_cmd_remove_migratable(mvdev);
- vfio_pci_core_uninit_device(&mvdev->core_device);
- kfree(mvdev);
+out_put_vdev:
+ vfio_put_device(&mvdev->core_device.vdev);
return ret;
}

@@ -628,9 +652,7 @@ static void mlx5vf_pci_remove(struct pci_dev *pdev)
struct mlx5vf_pci_core_device *mvdev = mlx5vf_drvdata(pdev);

vfio_pci_core_unregister_device(&mvdev->core_device);
- mlx5vf_cmd_remove_migratable(mvdev);
- vfio_pci_core_uninit_device(&mvdev->core_device);
- kfree(mvdev);
+ vfio_put_device(&mvdev->core_device.vdev);
}

static const struct pci_device_id mlx5vf_pci_table[] = {
--
2.21.3

2022-09-09 03:51:22

by Tian, Kevin

[permalink] [raw]
Subject: [PATCH v3 04/15] vfio/hisi_acc: Use the new device life cycle helpers

From: Yi Liu <[email protected]>

Tidy up @probe so all migration specific initialization logic is moved
to migration specific @init callback.

Remove vfio_pci_core_{un}init_device() given no user now.

Signed-off-by: Yi Liu <[email protected]>
Signed-off-by: Kevin Tian <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Reviewed-by: Shameer Kolothum <[email protected]>
---
.../vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 80 +++++++++----------
drivers/vfio/pci/vfio_pci_core.c | 30 -------
include/linux/vfio_pci_core.h | 4 -
3 files changed, 37 insertions(+), 77 deletions(-)

diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
index 258cae0863ea..47174e2b61bd 100644
--- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
+++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
@@ -1213,8 +1213,28 @@ static const struct vfio_migration_ops hisi_acc_vfio_pci_migrn_state_ops = {
.migration_get_state = hisi_acc_vfio_pci_get_device_state,
};

+static int hisi_acc_vfio_pci_migrn_init_dev(struct vfio_device *core_vdev)
+{
+ struct hisi_acc_vf_core_device *hisi_acc_vdev = container_of(core_vdev,
+ struct hisi_acc_vf_core_device, core_device.vdev);
+ struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
+ struct hisi_qm *pf_qm = hisi_acc_get_pf_qm(pdev);
+
+ hisi_acc_vdev->vf_id = pci_iov_vf_id(pdev) + 1;
+ hisi_acc_vdev->pf_qm = pf_qm;
+ hisi_acc_vdev->vf_dev = pdev;
+ mutex_init(&hisi_acc_vdev->state_mutex);
+
+ core_vdev->migration_flags = VFIO_MIGRATION_STOP_COPY;
+ core_vdev->mig_ops = &hisi_acc_vfio_pci_migrn_state_ops;
+
+ return vfio_pci_core_init_dev(core_vdev);
+}
+
static const struct vfio_device_ops hisi_acc_vfio_pci_migrn_ops = {
.name = "hisi-acc-vfio-pci-migration",
+ .init = hisi_acc_vfio_pci_migrn_init_dev,
+ .release = vfio_pci_core_release_dev,
.open_device = hisi_acc_vfio_pci_open_device,
.close_device = hisi_acc_vfio_pci_close_device,
.ioctl = hisi_acc_vfio_pci_ioctl,
@@ -1228,6 +1248,8 @@ static const struct vfio_device_ops hisi_acc_vfio_pci_migrn_ops = {

static const struct vfio_device_ops hisi_acc_vfio_pci_ops = {
.name = "hisi-acc-vfio-pci",
+ .init = vfio_pci_core_init_dev,
+ .release = vfio_pci_core_release_dev,
.open_device = hisi_acc_vfio_pci_open_device,
.close_device = vfio_pci_core_close_device,
.ioctl = vfio_pci_core_ioctl,
@@ -1239,63 +1261,36 @@ static const struct vfio_device_ops hisi_acc_vfio_pci_ops = {
.match = vfio_pci_core_match,
};

-static int
-hisi_acc_vfio_pci_migrn_init(struct hisi_acc_vf_core_device *hisi_acc_vdev,
- struct pci_dev *pdev, struct hisi_qm *pf_qm)
-{
- int vf_id;
-
- vf_id = pci_iov_vf_id(pdev);
- if (vf_id < 0)
- return vf_id;
-
- hisi_acc_vdev->vf_id = vf_id + 1;
- hisi_acc_vdev->core_device.vdev.migration_flags =
- VFIO_MIGRATION_STOP_COPY;
- hisi_acc_vdev->pf_qm = pf_qm;
- hisi_acc_vdev->vf_dev = pdev;
- mutex_init(&hisi_acc_vdev->state_mutex);
-
- return 0;
-}
-
static int hisi_acc_vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
struct hisi_acc_vf_core_device *hisi_acc_vdev;
+ const struct vfio_device_ops *ops = &hisi_acc_vfio_pci_ops;
struct hisi_qm *pf_qm;
+ int vf_id;
int ret;

- hisi_acc_vdev = kzalloc(sizeof(*hisi_acc_vdev), GFP_KERNEL);
- if (!hisi_acc_vdev)
- return -ENOMEM;
-
pf_qm = hisi_acc_get_pf_qm(pdev);
if (pf_qm && pf_qm->ver >= QM_HW_V3) {
- ret = hisi_acc_vfio_pci_migrn_init(hisi_acc_vdev, pdev, pf_qm);
- if (!ret) {
- vfio_pci_core_init_device(&hisi_acc_vdev->core_device, pdev,
- &hisi_acc_vfio_pci_migrn_ops);
- hisi_acc_vdev->core_device.vdev.mig_ops =
- &hisi_acc_vfio_pci_migrn_state_ops;
- } else {
+ vf_id = pci_iov_vf_id(pdev);
+ if (vf_id >= 0)
+ ops = &hisi_acc_vfio_pci_migrn_ops;
+ else
pci_warn(pdev, "migration support failed, continue with generic interface\n");
- vfio_pci_core_init_device(&hisi_acc_vdev->core_device, pdev,
- &hisi_acc_vfio_pci_ops);
- }
- } else {
- vfio_pci_core_init_device(&hisi_acc_vdev->core_device, pdev,
- &hisi_acc_vfio_pci_ops);
}

+ hisi_acc_vdev = vfio_alloc_device(hisi_acc_vf_core_device,
+ core_device.vdev, &pdev->dev, ops);
+ if (IS_ERR(hisi_acc_vdev))
+ return PTR_ERR(hisi_acc_vdev);
+
dev_set_drvdata(&pdev->dev, &hisi_acc_vdev->core_device);
ret = vfio_pci_core_register_device(&hisi_acc_vdev->core_device);
if (ret)
- goto out_free;
+ goto out_put_vdev;
return 0;

-out_free:
- vfio_pci_core_uninit_device(&hisi_acc_vdev->core_device);
- kfree(hisi_acc_vdev);
+out_put_vdev:
+ vfio_put_device(&hisi_acc_vdev->core_device.vdev);
return ret;
}

@@ -1304,8 +1299,7 @@ static void hisi_acc_vfio_pci_remove(struct pci_dev *pdev)
struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_drvdata(pdev);

vfio_pci_core_unregister_device(&hisi_acc_vdev->core_device);
- vfio_pci_core_uninit_device(&hisi_acc_vdev->core_device);
- kfree(hisi_acc_vdev);
+ vfio_put_device(&hisi_acc_vdev->core_device.vdev);
}

static const struct pci_device_id hisi_acc_vfio_pci_table[] = {
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 77d33739c6e8..59a28251bb0b 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -2113,36 +2113,6 @@ void vfio_pci_core_release_dev(struct vfio_device *core_vdev)
}
EXPORT_SYMBOL_GPL(vfio_pci_core_release_dev);

-void vfio_pci_core_init_device(struct vfio_pci_core_device *vdev,
- struct pci_dev *pdev,
- const struct vfio_device_ops *vfio_pci_ops)
-{
- vfio_init_group_dev(&vdev->vdev, &pdev->dev, vfio_pci_ops);
- vdev->pdev = pdev;
- vdev->irq_type = VFIO_PCI_NUM_IRQS;
- mutex_init(&vdev->igate);
- spin_lock_init(&vdev->irqlock);
- mutex_init(&vdev->ioeventfds_lock);
- INIT_LIST_HEAD(&vdev->dummy_resources_list);
- INIT_LIST_HEAD(&vdev->ioeventfds_list);
- mutex_init(&vdev->vma_lock);
- INIT_LIST_HEAD(&vdev->vma_list);
- INIT_LIST_HEAD(&vdev->sriov_pfs_item);
- init_rwsem(&vdev->memory_lock);
-}
-EXPORT_SYMBOL_GPL(vfio_pci_core_init_device);
-
-void vfio_pci_core_uninit_device(struct vfio_pci_core_device *vdev)
-{
- mutex_destroy(&vdev->igate);
- mutex_destroy(&vdev->ioeventfds_lock);
- mutex_destroy(&vdev->vma_lock);
- vfio_uninit_group_dev(&vdev->vdev);
- kfree(vdev->region);
- kfree(vdev->pm_save);
-}
-EXPORT_SYMBOL_GPL(vfio_pci_core_uninit_device);
-
int vfio_pci_core_register_device(struct vfio_pci_core_device *vdev)
{
struct pci_dev *pdev = vdev->pdev;
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 0499ea836058..367fd79226a3 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -106,13 +106,9 @@ int vfio_pci_core_register_dev_region(struct vfio_pci_core_device *vdev,
void vfio_pci_core_set_params(bool nointxmask, bool is_disable_vga,
bool is_disable_idle_d3);
void vfio_pci_core_close_device(struct vfio_device *core_vdev);
-void vfio_pci_core_init_device(struct vfio_pci_core_device *vdev,
- struct pci_dev *pdev,
- const struct vfio_device_ops *vfio_pci_ops);
int vfio_pci_core_init_dev(struct vfio_device *core_vdev);
void vfio_pci_core_release_dev(struct vfio_device *core_vdev);
int vfio_pci_core_register_device(struct vfio_pci_core_device *vdev);
-void vfio_pci_core_uninit_device(struct vfio_pci_core_device *vdev);
void vfio_pci_core_unregister_device(struct vfio_pci_core_device *vdev);
extern const struct pci_error_handlers vfio_pci_core_err_handlers;
int vfio_pci_core_sriov_configure(struct vfio_pci_core_device *vdev,
--
2.21.3

2022-09-09 03:52:42

by Tian, Kevin

[permalink] [raw]
Subject: [PATCH v3 05/15] vfio/mdpy: Use the new device life cycle helpers

From: Yi Liu <[email protected]>

and manage mdpy_count inside @init/@release.

Signed-off-by: Yi Liu <[email protected]>
Signed-off-by: Kevin Tian <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
---
samples/vfio-mdev/mdpy.c | 81 +++++++++++++++++++++++-----------------
1 file changed, 47 insertions(+), 34 deletions(-)

diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
index e8c46eb2e246..a07dac16d873 100644
--- a/samples/vfio-mdev/mdpy.c
+++ b/samples/vfio-mdev/mdpy.c
@@ -216,61 +216,77 @@ static int mdpy_reset(struct mdev_state *mdev_state)
return 0;
}

-static int mdpy_probe(struct mdev_device *mdev)
+static int mdpy_init_dev(struct vfio_device *vdev)
{
+ struct mdev_state *mdev_state =
+ container_of(vdev, struct mdev_state, vdev);
+ struct mdev_device *mdev = to_mdev_device(vdev->dev);
const struct mdpy_type *type =
&mdpy_types[mdev_get_type_group_id(mdev)];
- struct device *dev = mdev_dev(mdev);
- struct mdev_state *mdev_state;
u32 fbsize;
- int ret;
+ int ret = -ENOMEM;

if (mdpy_count >= max_devices)
- return -ENOMEM;
-
- mdev_state = kzalloc(sizeof(struct mdev_state), GFP_KERNEL);
- if (mdev_state == NULL)
- return -ENOMEM;
- vfio_init_group_dev(&mdev_state->vdev, &mdev->dev, &mdpy_dev_ops);
+ return ret;

mdev_state->vconfig = kzalloc(MDPY_CONFIG_SPACE_SIZE, GFP_KERNEL);
- if (mdev_state->vconfig == NULL) {
- ret = -ENOMEM;
- goto err_state;
- }
+ if (!mdev_state->vconfig)
+ return ret;

fbsize = roundup_pow_of_two(type->width * type->height * type->bytepp);

mdev_state->memblk = vmalloc_user(fbsize);
- if (!mdev_state->memblk) {
- ret = -ENOMEM;
- goto err_vconfig;
- }
- dev_info(dev, "%s: %s (%dx%d)\n", __func__, type->name, type->width,
- type->height);
+ if (!mdev_state->memblk)
+ goto out_vconfig;

mutex_init(&mdev_state->ops_lock);
mdev_state->mdev = mdev;
- mdev_state->type = type;
+ mdev_state->type = type;
mdev_state->memsize = fbsize;
mdpy_create_config_space(mdev_state);
mdpy_reset(mdev_state);

+ dev_info(vdev->dev, "%s: %s (%dx%d)\n", __func__, type->name, type->width,
+ type->height);
+
mdpy_count++;
+ return 0;
+
+out_vconfig:
+ kfree(mdev_state->vconfig);
+ return ret;
+}
+
+static int mdpy_probe(struct mdev_device *mdev)
+{
+ struct mdev_state *mdev_state;
+ int ret;
+
+ mdev_state = vfio_alloc_device(mdev_state, vdev, &mdev->dev,
+ &mdpy_dev_ops);
+ if (IS_ERR(mdev_state))
+ return PTR_ERR(mdev_state);

ret = vfio_register_emulated_iommu_dev(&mdev_state->vdev);
if (ret)
- goto err_mem;
+ goto err_put_vdev;
dev_set_drvdata(&mdev->dev, mdev_state);
return 0;
-err_mem:
+
+err_put_vdev:
+ vfio_put_device(&mdev_state->vdev);
+ return ret;
+}
+
+static void mdpy_release_dev(struct vfio_device *vdev)
+{
+ struct mdev_state *mdev_state =
+ container_of(vdev, struct mdev_state, vdev);
+
vfree(mdev_state->memblk);
-err_vconfig:
kfree(mdev_state->vconfig);
-err_state:
- vfio_uninit_group_dev(&mdev_state->vdev);
- kfree(mdev_state);
- return ret;
+ vfio_free_device(vdev);
+ mdpy_count--;
}

static void mdpy_remove(struct mdev_device *mdev)
@@ -280,12 +296,7 @@ static void mdpy_remove(struct mdev_device *mdev)
dev_info(&mdev->dev, "%s\n", __func__);

vfio_unregister_group_dev(&mdev_state->vdev);
- vfree(mdev_state->memblk);
- kfree(mdev_state->vconfig);
- vfio_uninit_group_dev(&mdev_state->vdev);
- kfree(mdev_state);
-
- mdpy_count--;
+ vfio_put_device(&mdev_state->vdev);
}

static ssize_t mdpy_read(struct vfio_device *vdev, char __user *buf,
@@ -708,6 +719,8 @@ static struct attribute_group *mdev_type_groups[] = {
};

static const struct vfio_device_ops mdpy_dev_ops = {
+ .init = mdpy_init_dev,
+ .release = mdpy_release_dev,
.read = mdpy_read,
.write = mdpy_write,
.ioctl = mdpy_ioctl,
--
2.21.3

2022-09-09 03:55:18

by Tian, Kevin

[permalink] [raw]
Subject: [PATCH v3 06/15] vfio/mtty: Use the new device life cycle helpers

From: Yi Liu <[email protected]>

and manage available ports inside @init/@release.

Signed-off-by: Yi Liu <[email protected]>
Signed-off-by: Kevin Tian <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
---
samples/vfio-mdev/mtty.c | 67 +++++++++++++++++++++++-----------------
1 file changed, 39 insertions(+), 28 deletions(-)

diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
index f42a59ed2e3f..41301d50b247 100644
--- a/samples/vfio-mdev/mtty.c
+++ b/samples/vfio-mdev/mtty.c
@@ -703,9 +703,11 @@ static ssize_t mdev_access(struct mdev_state *mdev_state, u8 *buf, size_t count,
return ret;
}

-static int mtty_probe(struct mdev_device *mdev)
+static int mtty_init_dev(struct vfio_device *vdev)
{
- struct mdev_state *mdev_state;
+ struct mdev_state *mdev_state =
+ container_of(vdev, struct mdev_state, vdev);
+ struct mdev_device *mdev = to_mdev_device(vdev->dev);
int nr_ports = mdev_get_type_group_id(mdev) + 1;
int avail_ports = atomic_read(&mdev_avail_ports);
int ret;
@@ -716,58 +718,65 @@ static int mtty_probe(struct mdev_device *mdev)
} while (!atomic_try_cmpxchg(&mdev_avail_ports,
&avail_ports, avail_ports - nr_ports));

- mdev_state = kzalloc(sizeof(struct mdev_state), GFP_KERNEL);
- if (mdev_state == NULL) {
- ret = -ENOMEM;
- goto err_nr_ports;
- }
-
- vfio_init_group_dev(&mdev_state->vdev, &mdev->dev, &mtty_dev_ops);
-
mdev_state->nr_ports = nr_ports;
mdev_state->irq_index = -1;
mdev_state->s[0].max_fifo_size = MAX_FIFO_SIZE;
mdev_state->s[1].max_fifo_size = MAX_FIFO_SIZE;
mutex_init(&mdev_state->rxtx_lock);
- mdev_state->vconfig = kzalloc(MTTY_CONFIG_SPACE_SIZE, GFP_KERNEL);

- if (mdev_state->vconfig == NULL) {
+ mdev_state->vconfig = kzalloc(MTTY_CONFIG_SPACE_SIZE, GFP_KERNEL);
+ if (!mdev_state->vconfig) {
ret = -ENOMEM;
- goto err_state;
+ goto err_nr_ports;
}

mutex_init(&mdev_state->ops_lock);
mdev_state->mdev = mdev;
-
mtty_create_config_space(mdev_state);
+ return 0;
+
+err_nr_ports:
+ atomic_add(nr_ports, &mdev_avail_ports);
+ return ret;
+}
+
+static int mtty_probe(struct mdev_device *mdev)
+{
+ struct mdev_state *mdev_state;
+ int ret;
+
+ mdev_state = vfio_alloc_device(mdev_state, vdev, &mdev->dev,
+ &mtty_dev_ops);
+ if (IS_ERR(mdev_state))
+ return PTR_ERR(mdev_state);

ret = vfio_register_emulated_iommu_dev(&mdev_state->vdev);
if (ret)
- goto err_vconfig;
+ goto err_put_vdev;
dev_set_drvdata(&mdev->dev, mdev_state);
return 0;

-err_vconfig:
- kfree(mdev_state->vconfig);
-err_state:
- vfio_uninit_group_dev(&mdev_state->vdev);
- kfree(mdev_state);
-err_nr_ports:
- atomic_add(nr_ports, &mdev_avail_ports);
+err_put_vdev:
+ vfio_put_device(&mdev_state->vdev);
return ret;
}

+static void mtty_release_dev(struct vfio_device *vdev)
+{
+ struct mdev_state *mdev_state =
+ container_of(vdev, struct mdev_state, vdev);
+
+ kfree(mdev_state->vconfig);
+ vfio_free_device(vdev);
+ atomic_add(mdev_state->nr_ports, &mdev_avail_ports);
+}
+
static void mtty_remove(struct mdev_device *mdev)
{
struct mdev_state *mdev_state = dev_get_drvdata(&mdev->dev);
- int nr_ports = mdev_state->nr_ports;

vfio_unregister_group_dev(&mdev_state->vdev);
-
- kfree(mdev_state->vconfig);
- vfio_uninit_group_dev(&mdev_state->vdev);
- kfree(mdev_state);
- atomic_add(nr_ports, &mdev_avail_ports);
+ vfio_put_device(&mdev_state->vdev);
}

static int mtty_reset(struct mdev_state *mdev_state)
@@ -1287,6 +1296,8 @@ static struct attribute_group *mdev_type_groups[] = {

static const struct vfio_device_ops mtty_dev_ops = {
.name = "vfio-mtty",
+ .init = mtty_init_dev,
+ .release = mtty_release_dev,
.read = mtty_read,
.write = mtty_write,
.ioctl = mtty_ioctl,
--
2.21.3

2022-09-09 03:55:32

by Tian, Kevin

[permalink] [raw]
Subject: [PATCH v3 11/15] vfio/platform: Use the new device life cycle helpers

Move vfio_device_ops from platform core to platform drivers so device
specific init/cleanup can be added.

Introduce two new helpers vfio_platform_init/release_common() for the
use in driver @init/@release.

vfio_platform_probe/remove_common() will be deprecated.

Signed-off-by: Kevin Tian <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
Tested-by: Eric Auger <[email protected]>
---
drivers/vfio/platform/vfio_platform.c | 66 +++++++++++++++----
drivers/vfio/platform/vfio_platform_common.c | 53 ++++++++++++---
drivers/vfio/platform/vfio_platform_private.h | 15 +++++
3 files changed, 111 insertions(+), 23 deletions(-)

diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
index 04f40c5acfd6..82cedcebfd90 100644
--- a/drivers/vfio/platform/vfio_platform.c
+++ b/drivers/vfio/platform/vfio_platform.c
@@ -7,6 +7,7 @@
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/vfio.h>
+#include <linux/pm_runtime.h>
#include <linux/platform_device.h>

#include "vfio_platform_private.h"
@@ -36,14 +37,11 @@ static int get_platform_irq(struct vfio_platform_device *vdev, int i)
return platform_get_irq_optional(pdev, i);
}

-static int vfio_platform_probe(struct platform_device *pdev)
+static int vfio_platform_init_dev(struct vfio_device *core_vdev)
{
- struct vfio_platform_device *vdev;
- int ret;
-
- vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
- if (!vdev)
- return -ENOMEM;
+ struct vfio_platform_device *vdev =
+ container_of(core_vdev, struct vfio_platform_device, vdev);
+ struct platform_device *pdev = to_platform_device(core_vdev->dev);

vdev->opaque = (void *) pdev;
vdev->name = pdev->name;
@@ -52,24 +50,64 @@ static int vfio_platform_probe(struct platform_device *pdev)
vdev->get_irq = get_platform_irq;
vdev->reset_required = reset_required;

- ret = vfio_platform_probe_common(vdev, &pdev->dev);
- if (ret) {
- kfree(vdev);
- return ret;
- }
+ return vfio_platform_init_common(vdev);
+}
+
+static const struct vfio_device_ops vfio_platform_ops;
+static int vfio_platform_probe(struct platform_device *pdev)
+{
+ struct vfio_platform_device *vdev;
+ int ret;
+
+ vdev = vfio_alloc_device(vfio_platform_device, vdev, &pdev->dev,
+ &vfio_platform_ops);
+ if (IS_ERR(vdev))
+ return PTR_ERR(vdev);
+
+ ret = vfio_register_group_dev(&vdev->vdev);
+ if (ret)
+ goto out_put_vdev;
+
+ pm_runtime_enable(&pdev->dev);
dev_set_drvdata(&pdev->dev, vdev);
return 0;
+
+out_put_vdev:
+ vfio_put_device(&vdev->vdev);
+ return ret;
+}
+
+static void vfio_platform_release_dev(struct vfio_device *core_vdev)
+{
+ struct vfio_platform_device *vdev =
+ container_of(core_vdev, struct vfio_platform_device, vdev);
+
+ vfio_platform_release_common(vdev);
+ vfio_free_device(core_vdev);
}

static int vfio_platform_remove(struct platform_device *pdev)
{
struct vfio_platform_device *vdev = dev_get_drvdata(&pdev->dev);

- vfio_platform_remove_common(vdev);
- kfree(vdev);
+ vfio_unregister_group_dev(&vdev->vdev);
+ pm_runtime_disable(vdev->device);
+ vfio_put_device(&vdev->vdev);
return 0;
}

+static const struct vfio_device_ops vfio_platform_ops = {
+ .name = "vfio-platform",
+ .init = vfio_platform_init_dev,
+ .release = vfio_platform_release_dev,
+ .open_device = vfio_platform_open_device,
+ .close_device = vfio_platform_close_device,
+ .ioctl = vfio_platform_ioctl,
+ .read = vfio_platform_read,
+ .write = vfio_platform_write,
+ .mmap = vfio_platform_mmap,
+};
+
static struct platform_driver vfio_platform_driver = {
.probe = vfio_platform_probe,
.remove = vfio_platform_remove,
diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c
index 256f55b84e70..4c01bf0adebb 100644
--- a/drivers/vfio/platform/vfio_platform_common.c
+++ b/drivers/vfio/platform/vfio_platform_common.c
@@ -218,7 +218,7 @@ static int vfio_platform_call_reset(struct vfio_platform_device *vdev,
return -EINVAL;
}

-static void vfio_platform_close_device(struct vfio_device *core_vdev)
+void vfio_platform_close_device(struct vfio_device *core_vdev)
{
struct vfio_platform_device *vdev =
container_of(core_vdev, struct vfio_platform_device, vdev);
@@ -236,8 +236,9 @@ static void vfio_platform_close_device(struct vfio_device *core_vdev)
vfio_platform_regions_cleanup(vdev);
vfio_platform_irq_cleanup(vdev);
}
+EXPORT_SYMBOL_GPL(vfio_platform_close_device);

-static int vfio_platform_open_device(struct vfio_device *core_vdev)
+int vfio_platform_open_device(struct vfio_device *core_vdev)
{
struct vfio_platform_device *vdev =
container_of(core_vdev, struct vfio_platform_device, vdev);
@@ -273,9 +274,10 @@ static int vfio_platform_open_device(struct vfio_device *core_vdev)
vfio_platform_regions_cleanup(vdev);
return ret;
}
+EXPORT_SYMBOL_GPL(vfio_platform_open_device);

-static long vfio_platform_ioctl(struct vfio_device *core_vdev,
- unsigned int cmd, unsigned long arg)
+long vfio_platform_ioctl(struct vfio_device *core_vdev,
+ unsigned int cmd, unsigned long arg)
{
struct vfio_platform_device *vdev =
container_of(core_vdev, struct vfio_platform_device, vdev);
@@ -382,6 +384,7 @@ static long vfio_platform_ioctl(struct vfio_device *core_vdev,

return -ENOTTY;
}
+EXPORT_SYMBOL_GPL(vfio_platform_ioctl);

static ssize_t vfio_platform_read_mmio(struct vfio_platform_region *reg,
char __user *buf, size_t count,
@@ -438,8 +441,8 @@ static ssize_t vfio_platform_read_mmio(struct vfio_platform_region *reg,
return -EFAULT;
}

-static ssize_t vfio_platform_read(struct vfio_device *core_vdev,
- char __user *buf, size_t count, loff_t *ppos)
+ssize_t vfio_platform_read(struct vfio_device *core_vdev,
+ char __user *buf, size_t count, loff_t *ppos)
{
struct vfio_platform_device *vdev =
container_of(core_vdev, struct vfio_platform_device, vdev);
@@ -460,6 +463,7 @@ static ssize_t vfio_platform_read(struct vfio_device *core_vdev,

return -EINVAL;
}
+EXPORT_SYMBOL_GPL(vfio_platform_read);

static ssize_t vfio_platform_write_mmio(struct vfio_platform_region *reg,
const char __user *buf, size_t count,
@@ -515,8 +519,8 @@ static ssize_t vfio_platform_write_mmio(struct vfio_platform_region *reg,
return -EFAULT;
}

-static ssize_t vfio_platform_write(struct vfio_device *core_vdev, const char __user *buf,
- size_t count, loff_t *ppos)
+ssize_t vfio_platform_write(struct vfio_device *core_vdev, const char __user *buf,
+ size_t count, loff_t *ppos)
{
struct vfio_platform_device *vdev =
container_of(core_vdev, struct vfio_platform_device, vdev);
@@ -537,6 +541,7 @@ static ssize_t vfio_platform_write(struct vfio_device *core_vdev, const char __u

return -EINVAL;
}
+EXPORT_SYMBOL_GPL(vfio_platform_write);

static int vfio_platform_mmap_mmio(struct vfio_platform_region region,
struct vm_area_struct *vma)
@@ -558,7 +563,7 @@ static int vfio_platform_mmap_mmio(struct vfio_platform_region region,
req_len, vma->vm_page_prot);
}

-static int vfio_platform_mmap(struct vfio_device *core_vdev, struct vm_area_struct *vma)
+int vfio_platform_mmap(struct vfio_device *core_vdev, struct vm_area_struct *vma)
{
struct vfio_platform_device *vdev =
container_of(core_vdev, struct vfio_platform_device, vdev);
@@ -598,6 +603,7 @@ static int vfio_platform_mmap(struct vfio_device *core_vdev, struct vm_area_stru

return -EINVAL;
}
+EXPORT_SYMBOL_GPL(vfio_platform_mmap);

static const struct vfio_device_ops vfio_platform_ops = {
.name = "vfio-platform",
@@ -639,6 +645,35 @@ static int vfio_platform_of_probe(struct vfio_platform_device *vdev,
* If the firmware is ACPI type, then acpi_disabled is 0. All other checks are
* valid checks. We cannot claim that this system is DT.
*/
+int vfio_platform_init_common(struct vfio_platform_device *vdev)
+{
+ int ret;
+ struct device *dev = vdev->vdev.dev;
+
+ ret = vfio_platform_acpi_probe(vdev, dev);
+ if (ret)
+ ret = vfio_platform_of_probe(vdev, dev);
+
+ if (ret)
+ return ret;
+
+ vdev->device = dev;
+ mutex_init(&vdev->igate);
+
+ ret = vfio_platform_get_reset(vdev);
+ if (ret && vdev->reset_required)
+ dev_err(dev, "No reset function found for device %s\n",
+ vdev->name);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(vfio_platform_init_common);
+
+void vfio_platform_release_common(struct vfio_platform_device *vdev)
+{
+ vfio_platform_put_reset(vdev);
+}
+EXPORT_SYMBOL_GPL(vfio_platform_release_common);
+
int vfio_platform_probe_common(struct vfio_platform_device *vdev,
struct device *dev)
{
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index 691b43f4b2b2..a769d649fb97 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -81,6 +81,21 @@ struct vfio_platform_reset_node {
int vfio_platform_probe_common(struct vfio_platform_device *vdev,
struct device *dev);
void vfio_platform_remove_common(struct vfio_platform_device *vdev);
+int vfio_platform_init_common(struct vfio_platform_device *vdev);
+void vfio_platform_release_common(struct vfio_platform_device *vdev);
+
+int vfio_platform_open_device(struct vfio_device *core_vdev);
+void vfio_platform_close_device(struct vfio_device *core_vdev);
+long vfio_platform_ioctl(struct vfio_device *core_vdev,
+ unsigned int cmd, unsigned long arg);
+ssize_t vfio_platform_read(struct vfio_device *core_vdev,
+ char __user *buf, size_t count,
+ loff_t *ppos);
+ssize_t vfio_platform_write(struct vfio_device *core_vdev,
+ const char __user *buf,
+ size_t count, loff_t *ppos);
+int vfio_platform_mmap(struct vfio_device *core_vdev,
+ struct vm_area_struct *vma);

int vfio_platform_irq_init(struct vfio_platform_device *vdev);
void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev);
--
2.21.3

2022-09-09 03:55:40

by Tian, Kevin

[permalink] [raw]
Subject: [PATCH v3 13/15] vfio/ccw: Use the new device life cycle helpers

ccw is the only exception which cannot use vfio_alloc_device() because
its private device structure is designed to serve both mdev and parent.
Life cycle of the parent is managed by css_driver so vfio_ccw_private
must be allocated/freed in css_driver probe/remove path instead of
conforming to vfio core life cycle for mdev.

Given that use a wait/completion scheme so the mdev remove path waits
after vfio_put_device() until receiving a completion notification from
@release. The completion indicates that all active references on
vfio_device have been released.

After that point although free of vfio_ccw_private is delayed to
css_driver it's at least guaranteed to have no parallel reference on
released vfio device part from other code paths.

memset() in @probe is removed. vfio_device is either already cleared
when probed for the first time or cleared in @release from last probe.

The right fix is to introduce separate structures for mdev and parent,
but this won't happen in short term per prior discussions.

Remove vfio_init/uninit_group_dev() as no user now.

Suggested-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Kevin Tian <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Reviewed-by: Eric Farman <[email protected]>
---
drivers/s390/cio/vfio_ccw_ops.c | 52 +++++++++++++++++++++++++----
drivers/s390/cio/vfio_ccw_private.h | 3 ++
drivers/vfio/vfio_main.c | 23 +++----------
include/linux/vfio.h | 3 --
4 files changed, 53 insertions(+), 28 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index 4a806a2273b5..9f8486c0d3d3 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -87,6 +87,15 @@ static struct attribute_group *mdev_type_groups[] = {
NULL,
};

+static int vfio_ccw_mdev_init_dev(struct vfio_device *vdev)
+{
+ struct vfio_ccw_private *private =
+ container_of(vdev, struct vfio_ccw_private, vdev);
+
+ init_completion(&private->release_comp);
+ return 0;
+}
+
static int vfio_ccw_mdev_probe(struct mdev_device *mdev)
{
struct vfio_ccw_private *private = dev_get_drvdata(mdev->dev.parent);
@@ -98,9 +107,9 @@ static int vfio_ccw_mdev_probe(struct mdev_device *mdev)
if (atomic_dec_if_positive(&private->avail) < 0)
return -EPERM;

- memset(&private->vdev, 0, sizeof(private->vdev));
- vfio_init_group_dev(&private->vdev, &mdev->dev,
- &vfio_ccw_dev_ops);
+ ret = vfio_init_device(&private->vdev, &mdev->dev, &vfio_ccw_dev_ops);
+ if (ret)
+ return ret;

VFIO_CCW_MSG_EVENT(2, "sch %x.%x.%04x: create\n",
private->sch->schid.cssid,
@@ -109,16 +118,33 @@ static int vfio_ccw_mdev_probe(struct mdev_device *mdev)

ret = vfio_register_emulated_iommu_dev(&private->vdev);
if (ret)
- goto err_atomic;
+ goto err_put_vdev;
dev_set_drvdata(&mdev->dev, private);
return 0;

-err_atomic:
- vfio_uninit_group_dev(&private->vdev);
+err_put_vdev:
+ vfio_put_device(&private->vdev);
atomic_inc(&private->avail);
return ret;
}

+static void vfio_ccw_mdev_release_dev(struct vfio_device *vdev)
+{
+ struct vfio_ccw_private *private =
+ container_of(vdev, struct vfio_ccw_private, vdev);
+
+ /*
+ * We cannot free vfio_ccw_private here because it includes
+ * parent info which must be free'ed by css driver.
+ *
+ * Use a workaround by memset'ing the core device part and
+ * then notifying the remove path that all active references
+ * to this device have been released.
+ */
+ memset(vdev, 0, sizeof(*vdev));
+ complete(&private->release_comp);
+}
+
static void vfio_ccw_mdev_remove(struct mdev_device *mdev)
{
struct vfio_ccw_private *private = dev_get_drvdata(mdev->dev.parent);
@@ -130,7 +156,17 @@ static void vfio_ccw_mdev_remove(struct mdev_device *mdev)

vfio_unregister_group_dev(&private->vdev);

- vfio_uninit_group_dev(&private->vdev);
+ vfio_put_device(&private->vdev);
+ /*
+ * Wait for all active references on mdev are released so it
+ * is safe to defer kfree() to a later point.
+ *
+ * TODO: the clean fix is to split parent/mdev info from ccw
+ * private structure so each can be managed in its own life
+ * cycle.
+ */
+ wait_for_completion(&private->release_comp);
+
atomic_inc(&private->avail);
}

@@ -592,6 +628,8 @@ static void vfio_ccw_mdev_request(struct vfio_device *vdev, unsigned int count)
}

static const struct vfio_device_ops vfio_ccw_dev_ops = {
+ .init = vfio_ccw_mdev_init_dev,
+ .release = vfio_ccw_mdev_release_dev,
.open_device = vfio_ccw_mdev_open_device,
.close_device = vfio_ccw_mdev_close_device,
.read = vfio_ccw_mdev_read,
diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
index cd24b7fada91..63d9202b29c7 100644
--- a/drivers/s390/cio/vfio_ccw_private.h
+++ b/drivers/s390/cio/vfio_ccw_private.h
@@ -88,6 +88,7 @@ struct vfio_ccw_crw {
* @req_trigger: eventfd ctx for signaling userspace to return device
* @io_work: work for deferral process of I/O handling
* @crw_work: work for deferral process of CRW handling
+ * @release_comp: synchronization helper for vfio device release
*/
struct vfio_ccw_private {
struct vfio_device vdev;
@@ -113,6 +114,8 @@ struct vfio_ccw_private {
struct eventfd_ctx *req_trigger;
struct work_struct io_work;
struct work_struct crw_work;
+
+ struct completion release_comp;
} __aligned(8);

int vfio_ccw_sch_quiesce(struct subchannel *sch);
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index adc1b697bb78..5bc221730d7b 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -483,28 +483,13 @@ static struct vfio_device *vfio_group_get_device(struct vfio_group *group,
/*
* VFIO driver API
*/
-void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
- const struct vfio_device_ops *ops)
-{
- init_completion(&device->comp);
- device->dev = dev;
- device->ops = ops;
-}
-EXPORT_SYMBOL_GPL(vfio_init_group_dev);
-
-void vfio_uninit_group_dev(struct vfio_device *device)
-{
- vfio_release_device_set(device);
-}
-EXPORT_SYMBOL_GPL(vfio_uninit_group_dev);
-
/* Release helper called by vfio_put_device() */
void vfio_device_release(struct kref *kref)
{
struct vfio_device *device =
container_of(kref, struct vfio_device, kref);

- vfio_uninit_group_dev(device);
+ vfio_release_device_set(device);

/*
* kvfree() cannot be done here due to a life cycle mess in
@@ -562,7 +547,9 @@ int vfio_init_device(struct vfio_device *device, struct device *dev,
{
int ret;

- vfio_init_group_dev(device, dev, ops);
+ init_completion(&device->comp);
+ device->dev = dev;
+ device->ops = ops;

if (ops->init) {
ret = ops->init(device);
@@ -574,7 +561,7 @@ int vfio_init_device(struct vfio_device *device, struct device *dev,
return 0;

out_uninit:
- vfio_uninit_group_dev(device);
+ vfio_release_device_set(device);
return ret;
}
EXPORT_SYMBOL_GPL(vfio_init_device);
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index f67cac700e6f..3cf857b1eec7 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -184,9 +184,6 @@ static inline void vfio_put_device(struct vfio_device *device)
kref_put(&device->kref, vfio_device_release);
}

-void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
- const struct vfio_device_ops *ops);
-void vfio_uninit_group_dev(struct vfio_device *device);
int vfio_register_group_dev(struct vfio_device *device);
int vfio_register_emulated_iommu_dev(struct vfio_device *device);
void vfio_unregister_group_dev(struct vfio_device *device);
--
2.21.3

2022-09-09 03:56:59

by Tian, Kevin

[permalink] [raw]
Subject: [PATCH v3 01/15] vfio: Add helpers for unifying vfio_device life cycle

The idea is to let vfio core manage the vfio_device life cycle instead
of duplicating the logic cross drivers. This is also a preparatory
step for adding struct device into vfio_device.

New pair of helpers together with a kref in vfio_device:

- vfio_alloc_device()
- vfio_put_device()

Drivers can register @init/@release callbacks to manage any private
state wrapping the vfio_device.

However vfio-ccw doesn't fit this model due to a life cycle mess
that its private structure mixes both parent and mdev info hence must
be allocated/freed outside of the life cycle of vfio device.

Per prior discussions this won't be fixed in short term by IBM folks.

Instead of waiting for those modifications introduce another helper
vfio_init_device() so ccw can call it to initialize a pre-allocated
vfio_device.

Further implication of the ccw trick is that vfio_device cannot be
freed uniformly in vfio core. Instead, require *EVERY* driver to
implement @release and free vfio_device inside. Then ccw can choose
to delay the free at its own discretion.

Another trick down the road is that kvzalloc() is used to accommodate
the need of gvt which uses vzalloc() while all others use kzalloc().
So drivers should call a helper vfio_free_device() to free the
vfio_device instead of assuming that kfree() or vfree() is appliable.

Later once the ccw mess is fixed we can remove those tricks and
fully handle structure alloc/free in vfio core.

Existing vfio_{un}init_group_dev() will be deprecated after all
existing usages are converted to the new model.

Suggested-by: Jason Gunthorpe <[email protected]>
Co-developed-by: Yi Liu <[email protected]>
Signed-off-by: Yi Liu <[email protected]>
Signed-off-by: Kevin Tian <[email protected]>
Reviewed-by: Tony Krowiak <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
---
drivers/vfio/vfio_main.c | 92 ++++++++++++++++++++++++++++++++++++++++
include/linux/vfio.h | 25 ++++++++++-
2 files changed, 116 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 27d9186f35d5..adc1b697bb78 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -498,6 +498,98 @@ void vfio_uninit_group_dev(struct vfio_device *device)
}
EXPORT_SYMBOL_GPL(vfio_uninit_group_dev);

+/* Release helper called by vfio_put_device() */
+void vfio_device_release(struct kref *kref)
+{
+ struct vfio_device *device =
+ container_of(kref, struct vfio_device, kref);
+
+ vfio_uninit_group_dev(device);
+
+ /*
+ * kvfree() cannot be done here due to a life cycle mess in
+ * vfio-ccw. Before the ccw part is fixed all drivers are
+ * required to support @release and call vfio_free_device()
+ * from there.
+ */
+ device->ops->release(device);
+}
+EXPORT_SYMBOL_GPL(vfio_device_release);
+
+/*
+ * Alloc and initialize vfio_device so it can be registered to vfio
+ * core.
+ *
+ * Drivers should use the wrapper vfio_alloc_device() for allocation.
+ * @size is the size of the structure to be allocated, including any
+ * private data used by the driver.
+ *
+ * Driver may provide an @init callback to cover device private data.
+ *
+ * Use vfio_put_device() to release the structure after success return.
+ */
+struct vfio_device *_vfio_alloc_device(size_t size, struct device *dev,
+ const struct vfio_device_ops *ops)
+{
+ struct vfio_device *device;
+ int ret;
+
+ if (WARN_ON(size < sizeof(struct vfio_device)))
+ return ERR_PTR(-EINVAL);
+
+ device = kvzalloc(size, GFP_KERNEL);
+ if (!device)
+ return ERR_PTR(-ENOMEM);
+
+ ret = vfio_init_device(device, dev, ops);
+ if (ret)
+ goto out_free;
+ return device;
+
+out_free:
+ kvfree(device);
+ return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_GPL(_vfio_alloc_device);
+
+/*
+ * Initialize a vfio_device so it can be registered to vfio core.
+ *
+ * Only vfio-ccw driver should call this interface.
+ */
+int vfio_init_device(struct vfio_device *device, struct device *dev,
+ const struct vfio_device_ops *ops)
+{
+ int ret;
+
+ vfio_init_group_dev(device, dev, ops);
+
+ if (ops->init) {
+ ret = ops->init(device);
+ if (ret)
+ goto out_uninit;
+ }
+
+ kref_init(&device->kref);
+ return 0;
+
+out_uninit:
+ vfio_uninit_group_dev(device);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(vfio_init_device);
+
+/*
+ * The helper called by driver @release callback to free the device
+ * structure. Drivers which don't have private data to clean can
+ * simply use this helper as its @release.
+ */
+void vfio_free_device(struct vfio_device *device)
+{
+ kvfree(device);
+}
+EXPORT_SYMBOL_GPL(vfio_free_device);
+
static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev,
enum vfio_group_type type)
{
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 0e2826559091..f67cac700e6f 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -47,7 +47,8 @@ struct vfio_device {
struct kvm *kvm;

/* Members below here are private, not for driver use */
- refcount_t refcount;
+ struct kref kref; /* object life cycle */
+ refcount_t refcount; /* user count on registered device*/
unsigned int open_count;
struct completion comp;
struct list_head group_next;
@@ -57,6 +58,8 @@ struct vfio_device {
/**
* struct vfio_device_ops - VFIO bus driver device callbacks
*
+ * @init: initialize private fields in device structure
+ * @release: Reclaim private fields in device structure
* @open_device: Called when the first file descriptor is opened for this device
* @close_device: Opposite of open_device
* @read: Perform read(2) on device file descriptor
@@ -74,6 +77,8 @@ struct vfio_device {
*/
struct vfio_device_ops {
char *name;
+ int (*init)(struct vfio_device *vdev);
+ void (*release)(struct vfio_device *vdev);
int (*open_device)(struct vfio_device *vdev);
void (*close_device)(struct vfio_device *vdev);
ssize_t (*read)(struct vfio_device *vdev, char __user *buf,
@@ -161,6 +166,24 @@ static inline int vfio_check_feature(u32 flags, size_t argsz, u32 supported_ops,
return 1;
}

+struct vfio_device *_vfio_alloc_device(size_t size, struct device *dev,
+ const struct vfio_device_ops *ops);
+#define vfio_alloc_device(dev_struct, member, dev, ops) \
+ container_of(_vfio_alloc_device(sizeof(struct dev_struct) + \
+ BUILD_BUG_ON_ZERO(offsetof( \
+ struct dev_struct, member)), \
+ dev, ops), \
+ struct dev_struct, member)
+
+int vfio_init_device(struct vfio_device *device, struct device *dev,
+ const struct vfio_device_ops *ops);
+void vfio_free_device(struct vfio_device *device);
+void vfio_device_release(struct kref *kref);
+static inline void vfio_put_device(struct vfio_device *device)
+{
+ kref_put(&device->kref, vfio_device_release);
+}
+
void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
const struct vfio_device_ops *ops);
void vfio_uninit_group_dev(struct vfio_device *device);
--
2.21.3

2022-09-09 03:57:08

by Tian, Kevin

[permalink] [raw]
Subject: [PATCH v3 12/15] vfio/amba: Use the new device life cycle helpers

Implement amba's own vfio_device_ops.

Remove vfio_platform_probe/remove_common() given no user now.

Signed-off-by: Kevin Tian <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
---
drivers/vfio/platform/vfio_amba.c | 72 ++++++++++++++-----
drivers/vfio/platform/vfio_platform_common.c | 60 ----------------
drivers/vfio/platform/vfio_platform_private.h | 3 -
3 files changed, 55 insertions(+), 80 deletions(-)

diff --git a/drivers/vfio/platform/vfio_amba.c b/drivers/vfio/platform/vfio_amba.c
index 1aaa4f721bd2..eaea63e5294c 100644
--- a/drivers/vfio/platform/vfio_amba.c
+++ b/drivers/vfio/platform/vfio_amba.c
@@ -7,6 +7,7 @@
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/vfio.h>
+#include <linux/pm_runtime.h>
#include <linux/amba/bus.h>

#include "vfio_platform_private.h"
@@ -40,20 +41,16 @@ static int get_amba_irq(struct vfio_platform_device *vdev, int i)
return ret ? ret : -ENXIO;
}

-static int vfio_amba_probe(struct amba_device *adev, const struct amba_id *id)
+static int vfio_amba_init_dev(struct vfio_device *core_vdev)
{
- struct vfio_platform_device *vdev;
+ struct vfio_platform_device *vdev =
+ container_of(core_vdev, struct vfio_platform_device, vdev);
+ struct amba_device *adev = to_amba_device(core_vdev->dev);
int ret;

- vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
- if (!vdev)
- return -ENOMEM;
-
vdev->name = kasprintf(GFP_KERNEL, "vfio-amba-%08x", adev->periphid);
- if (!vdev->name) {
- kfree(vdev);
+ if (!vdev->name)
return -ENOMEM;
- }

vdev->opaque = (void *) adev;
vdev->flags = VFIO_DEVICE_FLAGS_AMBA;
@@ -61,26 +58,67 @@ static int vfio_amba_probe(struct amba_device *adev, const struct amba_id *id)
vdev->get_irq = get_amba_irq;
vdev->reset_required = false;

- ret = vfio_platform_probe_common(vdev, &adev->dev);
- if (ret) {
+ ret = vfio_platform_init_common(vdev);
+ if (ret)
kfree(vdev->name);
- kfree(vdev);
- return ret;
- }
+ return ret;
+}
+
+static const struct vfio_device_ops vfio_amba_ops;
+static int vfio_amba_probe(struct amba_device *adev, const struct amba_id *id)
+{
+ struct vfio_platform_device *vdev;
+ int ret;
+
+ vdev = vfio_alloc_device(vfio_platform_device, vdev, &adev->dev,
+ &vfio_amba_ops);
+ if (IS_ERR(vdev))
+ return PTR_ERR(vdev);

+ ret = vfio_register_group_dev(&vdev->vdev);
+ if (ret)
+ goto out_put_vdev;
+
+ pm_runtime_enable(&adev->dev);
dev_set_drvdata(&adev->dev, vdev);
return 0;
+
+out_put_vdev:
+ vfio_put_device(&vdev->vdev);
+ return ret;
+}
+
+static void vfio_amba_release_dev(struct vfio_device *core_vdev)
+{
+ struct vfio_platform_device *vdev =
+ container_of(core_vdev, struct vfio_platform_device, vdev);
+
+ vfio_platform_release_common(vdev);
+ kfree(vdev->name);
+ vfio_free_device(core_vdev);
}

static void vfio_amba_remove(struct amba_device *adev)
{
struct vfio_platform_device *vdev = dev_get_drvdata(&adev->dev);

- vfio_platform_remove_common(vdev);
- kfree(vdev->name);
- kfree(vdev);
+ vfio_unregister_group_dev(&vdev->vdev);
+ pm_runtime_disable(vdev->device);
+ vfio_put_device(&vdev->vdev);
}

+static const struct vfio_device_ops vfio_amba_ops = {
+ .name = "vfio-amba",
+ .init = vfio_amba_init_dev,
+ .release = vfio_amba_release_dev,
+ .open_device = vfio_platform_open_device,
+ .close_device = vfio_platform_close_device,
+ .ioctl = vfio_platform_ioctl,
+ .read = vfio_platform_read,
+ .write = vfio_platform_write,
+ .mmap = vfio_platform_mmap,
+};
+
static const struct amba_id pl330_ids[] = {
{ 0, 0 },
};
diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c
index 4c01bf0adebb..55dc4f43c31e 100644
--- a/drivers/vfio/platform/vfio_platform_common.c
+++ b/drivers/vfio/platform/vfio_platform_common.c
@@ -605,16 +605,6 @@ int vfio_platform_mmap(struct vfio_device *core_vdev, struct vm_area_struct *vma
}
EXPORT_SYMBOL_GPL(vfio_platform_mmap);

-static const struct vfio_device_ops vfio_platform_ops = {
- .name = "vfio-platform",
- .open_device = vfio_platform_open_device,
- .close_device = vfio_platform_close_device,
- .ioctl = vfio_platform_ioctl,
- .read = vfio_platform_read,
- .write = vfio_platform_write,
- .mmap = vfio_platform_mmap,
-};
-
static int vfio_platform_of_probe(struct vfio_platform_device *vdev,
struct device *dev)
{
@@ -674,56 +664,6 @@ void vfio_platform_release_common(struct vfio_platform_device *vdev)
}
EXPORT_SYMBOL_GPL(vfio_platform_release_common);

-int vfio_platform_probe_common(struct vfio_platform_device *vdev,
- struct device *dev)
-{
- int ret;
-
- vfio_init_group_dev(&vdev->vdev, dev, &vfio_platform_ops);
-
- ret = vfio_platform_acpi_probe(vdev, dev);
- if (ret)
- ret = vfio_platform_of_probe(vdev, dev);
-
- if (ret)
- goto out_uninit;
-
- vdev->device = dev;
-
- ret = vfio_platform_get_reset(vdev);
- if (ret && vdev->reset_required) {
- dev_err(dev, "No reset function found for device %s\n",
- vdev->name);
- goto out_uninit;
- }
-
- ret = vfio_register_group_dev(&vdev->vdev);
- if (ret)
- goto put_reset;
-
- mutex_init(&vdev->igate);
-
- pm_runtime_enable(dev);
- return 0;
-
-put_reset:
- vfio_platform_put_reset(vdev);
-out_uninit:
- vfio_uninit_group_dev(&vdev->vdev);
- return ret;
-}
-EXPORT_SYMBOL_GPL(vfio_platform_probe_common);
-
-void vfio_platform_remove_common(struct vfio_platform_device *vdev)
-{
- vfio_unregister_group_dev(&vdev->vdev);
-
- pm_runtime_disable(vdev->device);
- vfio_platform_put_reset(vdev);
- vfio_uninit_group_dev(&vdev->vdev);
-}
-EXPORT_SYMBOL_GPL(vfio_platform_remove_common);
-
void __vfio_platform_register_reset(struct vfio_platform_reset_node *node)
{
mutex_lock(&driver_lock);
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index a769d649fb97..8d8fab516849 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -78,9 +78,6 @@ struct vfio_platform_reset_node {
vfio_platform_reset_fn_t of_reset;
};

-int vfio_platform_probe_common(struct vfio_platform_device *vdev,
- struct device *dev);
-void vfio_platform_remove_common(struct vfio_platform_device *vdev);
int vfio_platform_init_common(struct vfio_platform_device *vdev);
void vfio_platform_release_common(struct vfio_platform_device *vdev);

--
2.21.3

2022-09-09 03:57:24

by Tian, Kevin

[permalink] [raw]
Subject: [PATCH v3 07/15] vfio/mbochs: Use the new device life cycle helpers

From: Yi Liu <[email protected]>

and manage avail_mbytes inside @init/@release.

Signed-off-by: Yi Liu <[email protected]>
Signed-off-by: Kevin Tian <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
---
samples/vfio-mdev/mbochs.c | 73 ++++++++++++++++++++++++--------------
1 file changed, 46 insertions(+), 27 deletions(-)

diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
index 344c2901a82b..df95f25fbc0e 100644
--- a/samples/vfio-mdev/mbochs.c
+++ b/samples/vfio-mdev/mbochs.c
@@ -505,13 +505,14 @@ static int mbochs_reset(struct mdev_state *mdev_state)
return 0;
}

-static int mbochs_probe(struct mdev_device *mdev)
+static int mbochs_init_dev(struct vfio_device *vdev)
{
- int avail_mbytes = atomic_read(&mbochs_avail_mbytes);
+ struct mdev_state *mdev_state =
+ container_of(vdev, struct mdev_state, vdev);
+ struct mdev_device *mdev = to_mdev_device(vdev->dev);
const struct mbochs_type *type =
&mbochs_types[mdev_get_type_group_id(mdev)];
- struct device *dev = mdev_dev(mdev);
- struct mdev_state *mdev_state;
+ int avail_mbytes = atomic_read(&mbochs_avail_mbytes);
int ret = -ENOMEM;

do {
@@ -520,14 +521,9 @@ static int mbochs_probe(struct mdev_device *mdev)
} while (!atomic_try_cmpxchg(&mbochs_avail_mbytes, &avail_mbytes,
avail_mbytes - type->mbytes));

- mdev_state = kzalloc(sizeof(struct mdev_state), GFP_KERNEL);
- if (mdev_state == NULL)
- goto err_avail;
- vfio_init_group_dev(&mdev_state->vdev, &mdev->dev, &mbochs_dev_ops);
-
mdev_state->vconfig = kzalloc(MBOCHS_CONFIG_SPACE_SIZE, GFP_KERNEL);
- if (mdev_state->vconfig == NULL)
- goto err_mem;
+ if (!mdev_state->vconfig)
+ goto err_avail;

mdev_state->memsize = type->mbytes * 1024 * 1024;
mdev_state->pagecount = mdev_state->memsize >> PAGE_SHIFT;
@@ -535,10 +531,7 @@ static int mbochs_probe(struct mdev_device *mdev)
sizeof(struct page *),
GFP_KERNEL);
if (!mdev_state->pages)
- goto err_mem;
-
- dev_info(dev, "%s: %s, %d MB, %ld pages\n", __func__,
- type->name, type->mbytes, mdev_state->pagecount);
+ goto err_vconfig;

mutex_init(&mdev_state->ops_lock);
mdev_state->mdev = mdev;
@@ -553,19 +546,47 @@ static int mbochs_probe(struct mdev_device *mdev)
mbochs_create_config_space(mdev_state);
mbochs_reset(mdev_state);

+ dev_info(vdev->dev, "%s: %s, %d MB, %ld pages\n", __func__,
+ type->name, type->mbytes, mdev_state->pagecount);
+ return 0;
+
+err_vconfig:
+ kfree(mdev_state->vconfig);
+err_avail:
+ atomic_add(type->mbytes, &mbochs_avail_mbytes);
+ return ret;
+}
+
+static int mbochs_probe(struct mdev_device *mdev)
+{
+ struct mdev_state *mdev_state;
+ int ret = -ENOMEM;
+
+ mdev_state = vfio_alloc_device(mdev_state, vdev, &mdev->dev,
+ &mbochs_dev_ops);
+ if (IS_ERR(mdev_state))
+ return PTR_ERR(mdev_state);
+
ret = vfio_register_emulated_iommu_dev(&mdev_state->vdev);
if (ret)
- goto err_mem;
+ goto err_put_vdev;
dev_set_drvdata(&mdev->dev, mdev_state);
return 0;
-err_mem:
- vfio_uninit_group_dev(&mdev_state->vdev);
+
+err_put_vdev:
+ vfio_put_device(&mdev_state->vdev);
+ return ret;
+}
+
+static void mbochs_release_dev(struct vfio_device *vdev)
+{
+ struct mdev_state *mdev_state =
+ container_of(vdev, struct mdev_state, vdev);
+
kfree(mdev_state->pages);
kfree(mdev_state->vconfig);
- kfree(mdev_state);
-err_avail:
- atomic_add(type->mbytes, &mbochs_avail_mbytes);
- return ret;
+ vfio_free_device(vdev);
+ atomic_add(mdev_state->type->mbytes, &mbochs_avail_mbytes);
}

static void mbochs_remove(struct mdev_device *mdev)
@@ -573,11 +594,7 @@ static void mbochs_remove(struct mdev_device *mdev)
struct mdev_state *mdev_state = dev_get_drvdata(&mdev->dev);

vfio_unregister_group_dev(&mdev_state->vdev);
- vfio_uninit_group_dev(&mdev_state->vdev);
- atomic_add(mdev_state->type->mbytes, &mbochs_avail_mbytes);
- kfree(mdev_state->pages);
- kfree(mdev_state->vconfig);
- kfree(mdev_state);
+ vfio_put_device(&mdev_state->vdev);
}

static ssize_t mbochs_read(struct vfio_device *vdev, char __user *buf,
@@ -1397,6 +1414,8 @@ static struct attribute_group *mdev_type_groups[] = {

static const struct vfio_device_ops mbochs_dev_ops = {
.close_device = mbochs_close_device,
+ .init = mbochs_init_dev,
+ .release = mbochs_release_dev,
.read = mbochs_read,
.write = mbochs_write,
.ioctl = mbochs_ioctl,
--
2.21.3

2022-09-09 03:57:37

by Tian, Kevin

[permalink] [raw]
Subject: [PATCH v3 15/15] vfio: Add struct device to vfio_device

From: Yi Liu <[email protected]>

and replace kref. With it a 'vfio-dev/vfioX' node is created under the
sysfs path of the parent, indicating the device is bound to a vfio
driver, e.g.:

/sys/devices/pci0000\:6f/0000\:6f\:01.0/vfio-dev/vfio0

It is also a preparatory step toward adding cdev for supporting future
device-oriented uAPI.

Add Documentation/ABI/testing/sysfs-devices-vfio-dev.

Also take this chance to rename chardev 'vfio' to 'vfio-group' in
/proc/devices.

Suggested-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Yi Liu <[email protected]>
Signed-off-by: Kevin Tian <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
---
.../ABI/testing/sysfs-devices-vfio-dev | 8 +++
MAINTAINERS | 1 +
drivers/vfio/vfio_main.c | 67 +++++++++++++++----
include/linux/vfio.h | 6 +-
4 files changed, 67 insertions(+), 15 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-devices-vfio-dev

diff --git a/Documentation/ABI/testing/sysfs-devices-vfio-dev b/Documentation/ABI/testing/sysfs-devices-vfio-dev
new file mode 100644
index 000000000000..e21424fd9666
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-devices-vfio-dev
@@ -0,0 +1,8 @@
+What: /sys/.../<device>/vfio-dev/vfioX/
+Date: September 2022
+Contact: Yi Liu <[email protected]>
+Description:
+ This directory is created when the device is bound to a
+ vfio driver. The layout under this directory matches what
+ exists for a standard 'struct device'. 'X' is a unique
+ index marking this device in vfio.
diff --git a/MAINTAINERS b/MAINTAINERS
index d30f26e07cd3..02c8f11b1c17 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -21312,6 +21312,7 @@ R: Cornelia Huck <[email protected]>
L: [email protected]
S: Maintained
T: git git://github.com/awilliam/linux-vfio.git
+F: Documentation/ABI/testing/sysfs-devices-vfio-dev
F: Documentation/driver-api/vfio.rst
F: drivers/vfio/
F: include/linux/vfio.h
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 37cbd10f3faf..cd23d35c878c 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -49,6 +49,8 @@ static struct vfio {
struct mutex group_lock; /* locks group_list */
struct ida group_ida;
dev_t group_devt;
+ struct class *device_class;
+ struct ida device_ida;
} vfio;

struct vfio_iommu_driver {
@@ -485,12 +487,13 @@ static struct vfio_device *vfio_group_get_device(struct vfio_group *group,
* VFIO driver API
*/
/* Release helper called by vfio_put_device() */
-void vfio_device_release(struct kref *kref)
+static void vfio_device_release(struct device *dev)
{
struct vfio_device *device =
- container_of(kref, struct vfio_device, kref);
+ container_of(dev, struct vfio_device, device);

vfio_release_device_set(device);
+ ida_free(&vfio.device_ida, device->index);

/*
* kvfree() cannot be done here due to a life cycle mess in
@@ -500,7 +503,6 @@ void vfio_device_release(struct kref *kref)
*/
device->ops->release(device);
}
-EXPORT_SYMBOL_GPL(vfio_device_release);

/*
* Alloc and initialize vfio_device so it can be registered to vfio
@@ -548,6 +550,13 @@ int vfio_init_device(struct vfio_device *device, struct device *dev,
{
int ret;

+ ret = ida_alloc_max(&vfio.device_ida, MINORMASK, GFP_KERNEL);
+ if (ret < 0) {
+ dev_dbg(dev, "Error to alloc index\n");
+ return ret;
+ }
+
+ device->index = ret;
init_completion(&device->comp);
device->dev = dev;
device->ops = ops;
@@ -558,11 +567,15 @@ int vfio_init_device(struct vfio_device *device, struct device *dev,
goto out_uninit;
}

- kref_init(&device->kref);
+ device_initialize(&device->device);
+ device->device.release = vfio_device_release;
+ device->device.class = vfio.device_class;
+ device->device.parent = device->dev;
return 0;

out_uninit:
vfio_release_device_set(device);
+ ida_free(&vfio.device_ida, device->index);
return ret;
}
EXPORT_SYMBOL_GPL(vfio_init_device);
@@ -659,6 +672,7 @@ static int __vfio_register_dev(struct vfio_device *device,
struct vfio_group *group)
{
struct vfio_device *existing_device;
+ int ret;

if (IS_ERR(group))
return PTR_ERR(group);
@@ -675,16 +689,21 @@ static int __vfio_register_dev(struct vfio_device *device,
dev_WARN(device->dev, "Device already exists on group %d\n",
iommu_group_id(group->iommu_group));
vfio_device_put_registration(existing_device);
- if (group->type == VFIO_NO_IOMMU ||
- group->type == VFIO_EMULATED_IOMMU)
- iommu_group_remove_device(device->dev);
- vfio_group_put(group);
- return -EBUSY;
+ ret = -EBUSY;
+ goto err_out;
}

/* Our reference on group is moved to the device */
device->group = group;

+ ret = dev_set_name(&device->device, "vfio%d", device->index);
+ if (ret)
+ goto err_out;
+
+ ret = device_add(&device->device);
+ if (ret)
+ goto err_out;
+
/* Refcounting can't start until the driver calls register */
refcount_set(&device->refcount, 1);

@@ -693,6 +712,12 @@ static int __vfio_register_dev(struct vfio_device *device,
mutex_unlock(&group->device_lock);

return 0;
+err_out:
+ if (group->type == VFIO_NO_IOMMU ||
+ group->type == VFIO_EMULATED_IOMMU)
+ iommu_group_remove_device(device->dev);
+ vfio_group_put(group);
+ return ret;
}

int vfio_register_group_dev(struct vfio_device *device)
@@ -779,6 +804,9 @@ void vfio_unregister_group_dev(struct vfio_device *device)
list_del(&device->group_next);
mutex_unlock(&group->device_lock);

+ /* Balances device_add in register path */
+ device_del(&device->device);
+
if (group->type == VFIO_NO_IOMMU || group->type == VFIO_EMULATED_IOMMU)
iommu_group_remove_device(device->dev);

@@ -2362,6 +2390,7 @@ static int __init vfio_init(void)
int ret;

ida_init(&vfio.group_ida);
+ ida_init(&vfio.device_ida);
mutex_init(&vfio.group_lock);
mutex_init(&vfio.iommu_drivers_lock);
INIT_LIST_HEAD(&vfio.group_list);
@@ -2377,12 +2406,20 @@ static int __init vfio_init(void)
vfio.class = class_create(THIS_MODULE, "vfio");
if (IS_ERR(vfio.class)) {
ret = PTR_ERR(vfio.class);
- goto err_class;
+ goto err_group_class;
}

vfio.class->devnode = vfio_devnode;

- ret = alloc_chrdev_region(&vfio.group_devt, 0, MINORMASK + 1, "vfio");
+ /* /sys/class/vfio-dev/vfioX */
+ vfio.device_class = class_create(THIS_MODULE, "vfio-dev");
+ if (IS_ERR(vfio.device_class)) {
+ ret = PTR_ERR(vfio.device_class);
+ goto err_dev_class;
+ }
+
+ ret = alloc_chrdev_region(&vfio.group_devt, 0, MINORMASK + 1,
+ "vfio-group");
if (ret)
goto err_alloc_chrdev;

@@ -2398,9 +2435,12 @@ static int __init vfio_init(void)
err_driver_register:
unregister_chrdev_region(vfio.group_devt, MINORMASK + 1);
err_alloc_chrdev:
+ class_destroy(vfio.device_class);
+ vfio.device_class = NULL;
+err_dev_class:
class_destroy(vfio.class);
vfio.class = NULL;
-err_class:
+err_group_class:
misc_deregister(&vfio_dev);
return ret;
}
@@ -2412,8 +2452,11 @@ static void __exit vfio_cleanup(void)
#ifdef CONFIG_VFIO_NOIOMMU
vfio_unregister_iommu_driver(&vfio_noiommu_ops);
#endif
+ ida_destroy(&vfio.device_ida);
ida_destroy(&vfio.group_ida);
unregister_chrdev_region(vfio.group_devt, MINORMASK + 1);
+ class_destroy(vfio.device_class);
+ vfio.device_class = NULL;
class_destroy(vfio.class);
vfio.class = NULL;
misc_deregister(&vfio_dev);
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 3cf857b1eec7..ee399a768070 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -47,7 +47,8 @@ struct vfio_device {
struct kvm *kvm;

/* Members below here are private, not for driver use */
- struct kref kref; /* object life cycle */
+ unsigned int index;
+ struct device device; /* device.kref covers object life circle */
refcount_t refcount; /* user count on registered device*/
unsigned int open_count;
struct completion comp;
@@ -178,10 +179,9 @@ struct vfio_device *_vfio_alloc_device(size_t size, struct device *dev,
int vfio_init_device(struct vfio_device *device, struct device *dev,
const struct vfio_device_ops *ops);
void vfio_free_device(struct vfio_device *device);
-void vfio_device_release(struct kref *kref);
static inline void vfio_put_device(struct vfio_device *device)
{
- kref_put(&device->kref, vfio_device_release);
+ put_device(&device->device);
}

int vfio_register_group_dev(struct vfio_device *device);
--
2.21.3

2022-09-09 08:39:12

by Ethan Zhao

[permalink] [raw]
Subject: Re: [PATCH v3 01/15] vfio: Add helpers for unifying vfio_device life cycle

Hi, Kevin,

在 2022/9/9 18:22, Kevin Tian 写道:
> The idea is to let vfio core manage the vfio_device life cycle instead
> of duplicating the logic cross drivers. This is also a preparatory
> step for adding struct device into vfio_device.
>
> New pair of helpers together with a kref in vfio_device:
>
> - vfio_alloc_device()
> - vfio_put_device()

To be honest, this pair of functions make me confusing to understand their

behaviour from wording point of view:

- vfio_alloc_device(), Okay, it allocates the vfio device, no reference
count thing. but,
- vfio_put_device()
seems it will decrease reference count and then if it is zero, free it.
so they are not of one *pair* about wording.

How about

- vfio_alloc_device() / - vfio_free_device()
or
- vfio_get_device() / - vfio_put_device(), perhaps not match their behviour
in following code.



Thanks,
Ethan


>
> Drivers can register @init/@release callbacks to manage any private
> state wrapping the vfio_device.
>
> However vfio-ccw doesn't fit this model due to a life cycle mess
> that its private structure mixes both parent and mdev info hence must
> be allocated/freed outside of the life cycle of vfio device.
>
> Per prior discussions this won't be fixed in short term by IBM folks.
>
> Instead of waiting for those modifications introduce another helper
> vfio_init_device() so ccw can call it to initialize a pre-allocated
> vfio_device.
>
> Further implication of the ccw trick is that vfio_device cannot be
> freed uniformly in vfio core. Instead, require *EVERY* driver to
> implement @release and free vfio_device inside. Then ccw can choose
> to delay the free at its own discretion.
>
> Another trick down the road is that kvzalloc() is used to accommodate
> the need of gvt which uses vzalloc() while all others use kzalloc().
> So drivers should call a helper vfio_free_device() to free the
> vfio_device instead of assuming that kfree() or vfree() is appliable.
>
> Later once the ccw mess is fixed we can remove those tricks and
> fully handle structure alloc/free in vfio core.
>
> Existing vfio_{un}init_group_dev() will be deprecated after all
> existing usages are converted to the new model.
>
> Suggested-by: Jason Gunthorpe <[email protected]>
> Co-developed-by: Yi Liu <[email protected]>
> Signed-off-by: Yi Liu <[email protected]>
> Signed-off-by: Kevin Tian <[email protected]>
> Reviewed-by: Tony Krowiak <[email protected]>
> Reviewed-by: Jason Gunthorpe <[email protected]>
> Reviewed-by: Eric Auger <[email protected]>
> ---
> drivers/vfio/vfio_main.c | 92 ++++++++++++++++++++++++++++++++++++++++
> include/linux/vfio.h | 25 ++++++++++-
> 2 files changed, 116 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
> index 27d9186f35d5..adc1b697bb78 100644
> --- a/drivers/vfio/vfio_main.c
> +++ b/drivers/vfio/vfio_main.c
> @@ -498,6 +498,98 @@ void vfio_uninit_group_dev(struct vfio_device *device)
> }
> EXPORT_SYMBOL_GPL(vfio_uninit_group_dev);
>
> +/* Release helper called by vfio_put_device() */
> +void vfio_device_release(struct kref *kref)
> +{
> + struct vfio_device *device =
> + container_of(kref, struct vfio_device, kref);
> +
> + vfio_uninit_group_dev(device);
> +
> + /*
> + * kvfree() cannot be done here due to a life cycle mess in
> + * vfio-ccw. Before the ccw part is fixed all drivers are
> + * required to support @release and call vfio_free_device()
> + * from there.
> + */
> + device->ops->release(device);
> +}
> +EXPORT_SYMBOL_GPL(vfio_device_release);
> +
> +/*
> + * Alloc and initialize vfio_device so it can be registered to vfio
> + * core.
> + *
> + * Drivers should use the wrapper vfio_alloc_device() for allocation.
> + * @size is the size of the structure to be allocated, including any
> + * private data used by the driver.
> + *
> + * Driver may provide an @init callback to cover device private data.
> + *
> + * Use vfio_put_device() to release the structure after success return.
> + */
> +struct vfio_device *_vfio_alloc_device(size_t size, struct device *dev,
> + const struct vfio_device_ops *ops)
> +{
> + struct vfio_device *device;
> + int ret;
> +
> + if (WARN_ON(size < sizeof(struct vfio_device)))
> + return ERR_PTR(-EINVAL);
> +
> + device = kvzalloc(size, GFP_KERNEL);
> + if (!device)
> + return ERR_PTR(-ENOMEM);
> +
> + ret = vfio_init_device(device, dev, ops);
> + if (ret)
> + goto out_free;
> + return device;
> +
> +out_free:
> + kvfree(device);
> + return ERR_PTR(ret);
> +}
> +EXPORT_SYMBOL_GPL(_vfio_alloc_device);
> +
> +/*
> + * Initialize a vfio_device so it can be registered to vfio core.
> + *
> + * Only vfio-ccw driver should call this interface.
> + */
> +int vfio_init_device(struct vfio_device *device, struct device *dev,
> + const struct vfio_device_ops *ops)
> +{
> + int ret;
> +
> + vfio_init_group_dev(device, dev, ops);
> +
> + if (ops->init) {
> + ret = ops->init(device);
> + if (ret)
> + goto out_uninit;
> + }
> +
> + kref_init(&device->kref);
> + return 0;
> +
> +out_uninit:
> + vfio_uninit_group_dev(device);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(vfio_init_device);
> +
> +/*
> + * The helper called by driver @release callback to free the device
> + * structure. Drivers which don't have private data to clean can
> + * simply use this helper as its @release.
> + */
> +void vfio_free_device(struct vfio_device *device)
> +{
> + kvfree(device);
> +}
> +EXPORT_SYMBOL_GPL(vfio_free_device);
> +
> static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev,
> enum vfio_group_type type)
> {
> diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> index 0e2826559091..f67cac700e6f 100644
> --- a/include/linux/vfio.h
> +++ b/include/linux/vfio.h
> @@ -47,7 +47,8 @@ struct vfio_device {
> struct kvm *kvm;
>
> /* Members below here are private, not for driver use */
> - refcount_t refcount;
> + struct kref kref; /* object life cycle */
> + refcount_t refcount; /* user count on registered device*/
> unsigned int open_count;
> struct completion comp;
> struct list_head group_next;
> @@ -57,6 +58,8 @@ struct vfio_device {
> /**
> * struct vfio_device_ops - VFIO bus driver device callbacks
> *
> + * @init: initialize private fields in device structure
> + * @release: Reclaim private fields in device structure
> * @open_device: Called when the first file descriptor is opened for this device
> * @close_device: Opposite of open_device
> * @read: Perform read(2) on device file descriptor
> @@ -74,6 +77,8 @@ struct vfio_device {
> */
> struct vfio_device_ops {
> char *name;
> + int (*init)(struct vfio_device *vdev);
> + void (*release)(struct vfio_device *vdev);
> int (*open_device)(struct vfio_device *vdev);
> void (*close_device)(struct vfio_device *vdev);
> ssize_t (*read)(struct vfio_device *vdev, char __user *buf,
> @@ -161,6 +166,24 @@ static inline int vfio_check_feature(u32 flags, size_t argsz, u32 supported_ops,
> return 1;
> }
>
> +struct vfio_device *_vfio_alloc_device(size_t size, struct device *dev,
> + const struct vfio_device_ops *ops);
> +#define vfio_alloc_device(dev_struct, member, dev, ops) \
> + container_of(_vfio_alloc_device(sizeof(struct dev_struct) + \
> + BUILD_BUG_ON_ZERO(offsetof( \
> + struct dev_struct, member)), \
> + dev, ops), \
> + struct dev_struct, member)
> +
> +int vfio_init_device(struct vfio_device *device, struct device *dev,
> + const struct vfio_device_ops *ops);
> +void vfio_free_device(struct vfio_device *device);
> +void vfio_device_release(struct kref *kref);
> +static inline void vfio_put_device(struct vfio_device *device)
> +{
> + kref_put(&device->kref, vfio_device_release);
> +}
> +
> void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
> const struct vfio_device_ops *ops);
> void vfio_uninit_group_dev(struct vfio_device *device);

--
"firm, enduring, strong, and long-lived"

2022-09-09 08:54:51

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH v3 01/15] vfio: Add helpers for unifying vfio_device life cycle

> From: Ethan Zhao <[email protected]>
> Sent: Friday, September 9, 2022 4:24 PM
>
> Hi, Kevin,
>
> 在 2022/9/9 18:22, Kevin Tian 写道:
> > The idea is to let vfio core manage the vfio_device life cycle instead
> > of duplicating the logic cross drivers. This is also a preparatory
> > step for adding struct device into vfio_device.
> >
> > New pair of helpers together with a kref in vfio_device:
> >
> > - vfio_alloc_device()
> > - vfio_put_device()
>
> To be honest, this pair of functions make me confusing to understand their
>
> behaviour from wording point of view:
>
> - vfio_alloc_device(), Okay, it allocates the vfio device, no reference
> count thing. but,

I think it's quite common to have an alloc() helper initialize refcount, e.g.
vfio_group_alloc() both initialize its user refcount and also call
device_initialize() to gets kref initialized. Similar example in
ib_alloc_device(), etc.

> - vfio_put_device()
> seems it will decrease reference count and then if it is zero, free it.
> so they are not of one *pair* about wording.
>
> How about
>
> - vfio_alloc_device() / - vfio_free_device()
> or
> - vfio_get_device() / - vfio_put_device(), perhaps not match their behviour
> in following code.
>
>
>
> Thanks,
> Ethan
>
>
> >
> > Drivers can register @init/@release callbacks to manage any private
> > state wrapping the vfio_device.
> >
> > However vfio-ccw doesn't fit this model due to a life cycle mess
> > that its private structure mixes both parent and mdev info hence must
> > be allocated/freed outside of the life cycle of vfio device.
> >
> > Per prior discussions this won't be fixed in short term by IBM folks.
> >
> > Instead of waiting for those modifications introduce another helper
> > vfio_init_device() so ccw can call it to initialize a pre-allocated
> > vfio_device.
> >
> > Further implication of the ccw trick is that vfio_device cannot be
> > freed uniformly in vfio core. Instead, require *EVERY* driver to
> > implement @release and free vfio_device inside. Then ccw can choose
> > to delay the free at its own discretion.
> >
> > Another trick down the road is that kvzalloc() is used to accommodate
> > the need of gvt which uses vzalloc() while all others use kzalloc().
> > So drivers should call a helper vfio_free_device() to free the
> > vfio_device instead of assuming that kfree() or vfree() is appliable.
> >
> > Later once the ccw mess is fixed we can remove those tricks and
> > fully handle structure alloc/free in vfio core.
> >
> > Existing vfio_{un}init_group_dev() will be deprecated after all
> > existing usages are converted to the new model.
> >
> > Suggested-by: Jason Gunthorpe <[email protected]>
> > Co-developed-by: Yi Liu <[email protected]>
> > Signed-off-by: Yi Liu <[email protected]>
> > Signed-off-by: Kevin Tian <[email protected]>
> > Reviewed-by: Tony Krowiak <[email protected]>
> > Reviewed-by: Jason Gunthorpe <[email protected]>
> > Reviewed-by: Eric Auger <[email protected]>
> > ---
> > drivers/vfio/vfio_main.c | 92
> ++++++++++++++++++++++++++++++++++++++++
> > include/linux/vfio.h | 25 ++++++++++-
> > 2 files changed, 116 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
> > index 27d9186f35d5..adc1b697bb78 100644
> > --- a/drivers/vfio/vfio_main.c
> > +++ b/drivers/vfio/vfio_main.c
> > @@ -498,6 +498,98 @@ void vfio_uninit_group_dev(struct vfio_device
> *device)
> > }
> > EXPORT_SYMBOL_GPL(vfio_uninit_group_dev);
> >
> > +/* Release helper called by vfio_put_device() */
> > +void vfio_device_release(struct kref *kref)
> > +{
> > + struct vfio_device *device =
> > + container_of(kref, struct vfio_device, kref);
> > +
> > + vfio_uninit_group_dev(device);
> > +
> > + /*
> > + * kvfree() cannot be done here due to a life cycle mess in
> > + * vfio-ccw. Before the ccw part is fixed all drivers are
> > + * required to support @release and call vfio_free_device()
> > + * from there.
> > + */
> > + device->ops->release(device);
> > +}
> > +EXPORT_SYMBOL_GPL(vfio_device_release);
> > +
> > +/*
> > + * Alloc and initialize vfio_device so it can be registered to vfio
> > + * core.
> > + *
> > + * Drivers should use the wrapper vfio_alloc_device() for allocation.
> > + * @size is the size of the structure to be allocated, including any
> > + * private data used by the driver.
> > + *
> > + * Driver may provide an @init callback to cover device private data.
> > + *
> > + * Use vfio_put_device() to release the structure after success return.
> > + */
> > +struct vfio_device *_vfio_alloc_device(size_t size, struct device *dev,
> > + const struct vfio_device_ops *ops)
> > +{
> > + struct vfio_device *device;
> > + int ret;
> > +
> > + if (WARN_ON(size < sizeof(struct vfio_device)))
> > + return ERR_PTR(-EINVAL);
> > +
> > + device = kvzalloc(size, GFP_KERNEL);
> > + if (!device)
> > + return ERR_PTR(-ENOMEM);
> > +
> > + ret = vfio_init_device(device, dev, ops);
> > + if (ret)
> > + goto out_free;
> > + return device;
> > +
> > +out_free:
> > + kvfree(device);
> > + return ERR_PTR(ret);
> > +}
> > +EXPORT_SYMBOL_GPL(_vfio_alloc_device);
> > +
> > +/*
> > + * Initialize a vfio_device so it can be registered to vfio core.
> > + *
> > + * Only vfio-ccw driver should call this interface.
> > + */
> > +int vfio_init_device(struct vfio_device *device, struct device *dev,
> > + const struct vfio_device_ops *ops)
> > +{
> > + int ret;
> > +
> > + vfio_init_group_dev(device, dev, ops);
> > +
> > + if (ops->init) {
> > + ret = ops->init(device);
> > + if (ret)
> > + goto out_uninit;
> > + }
> > +
> > + kref_init(&device->kref);
> > + return 0;
> > +
> > +out_uninit:
> > + vfio_uninit_group_dev(device);
> > + return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(vfio_init_device);
> > +
> > +/*
> > + * The helper called by driver @release callback to free the device
> > + * structure. Drivers which don't have private data to clean can
> > + * simply use this helper as its @release.
> > + */
> > +void vfio_free_device(struct vfio_device *device)
> > +{
> > + kvfree(device);
> > +}
> > +EXPORT_SYMBOL_GPL(vfio_free_device);
> > +
> > static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev,
> > enum vfio_group_type type)
> > {
> > diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> > index 0e2826559091..f67cac700e6f 100644
> > --- a/include/linux/vfio.h
> > +++ b/include/linux/vfio.h
> > @@ -47,7 +47,8 @@ struct vfio_device {
> > struct kvm *kvm;
> >
> > /* Members below here are private, not for driver use */
> > - refcount_t refcount;
> > + struct kref kref; /* object life cycle */
> > + refcount_t refcount; /* user count on registered device*/
> > unsigned int open_count;
> > struct completion comp;
> > struct list_head group_next;
> > @@ -57,6 +58,8 @@ struct vfio_device {
> > /**
> > * struct vfio_device_ops - VFIO bus driver device callbacks
> > *
> > + * @init: initialize private fields in device structure
> > + * @release: Reclaim private fields in device structure
> > * @open_device: Called when the first file descriptor is opened for this
> device
> > * @close_device: Opposite of open_device
> > * @read: Perform read(2) on device file descriptor
> > @@ -74,6 +77,8 @@ struct vfio_device {
> > */
> > struct vfio_device_ops {
> > char *name;
> > + int (*init)(struct vfio_device *vdev);
> > + void (*release)(struct vfio_device *vdev);
> > int (*open_device)(struct vfio_device *vdev);
> > void (*close_device)(struct vfio_device *vdev);
> > ssize_t (*read)(struct vfio_device *vdev, char __user *buf,
> > @@ -161,6 +166,24 @@ static inline int vfio_check_feature(u32 flags,
> size_t argsz, u32 supported_ops,
> > return 1;
> > }
> >
> > +struct vfio_device *_vfio_alloc_device(size_t size, struct device *dev,
> > + const struct vfio_device_ops *ops);
> > +#define vfio_alloc_device(dev_struct, member, dev, ops)
> \
> > + container_of(_vfio_alloc_device(sizeof(struct dev_struct) +
> \
> > + BUILD_BUG_ON_ZERO(offsetof(
> \
> > + struct dev_struct, member)),
> \
> > + dev, ops),
> \
> > + struct dev_struct, member)
> > +
> > +int vfio_init_device(struct vfio_device *device, struct device *dev,
> > + const struct vfio_device_ops *ops);
> > +void vfio_free_device(struct vfio_device *device);
> > +void vfio_device_release(struct kref *kref);
> > +static inline void vfio_put_device(struct vfio_device *device)
> > +{
> > + kref_put(&device->kref, vfio_device_release);
> > +}
> > +
> > void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
> > const struct vfio_device_ops *ops);
> > void vfio_uninit_group_dev(struct vfio_device *device);
>
> --
> "firm, enduring, strong, and long-lived"

2022-09-10 00:29:36

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v3 01/15] vfio: Add helpers for unifying vfio_device life cycle

On Fri, Sep 09, 2022 at 08:42:25AM +0000, Tian, Kevin wrote:

> I think it's quite common to have an alloc() helper initialize refcount, e.g.
> vfio_group_alloc() both initialize its user refcount and also call
> device_initialize() to gets kref initialized. Similar example in
> ib_alloc_device(), etc.

Right, it is quite a good/common pattern to have an allocation function
return a refcount to the caller.

I don't know of any naming standard for this however.

Jason

2022-09-20 20:16:59

by Alex Williamson

[permalink] [raw]
Subject: Re: [PATCH v3 06/15] vfio/mtty: Use the new device life cycle helpers

On Fri, 9 Sep 2022 18:22:38 +0800
Kevin Tian <[email protected]> wrote:

> From: Yi Liu <[email protected]>
>
> and manage available ports inside @init/@release.
>
> Signed-off-by: Yi Liu <[email protected]>
> Signed-off-by: Kevin Tian <[email protected]>
> Reviewed-by: Jason Gunthorpe <[email protected]>
> ---
> samples/vfio-mdev/mtty.c | 67 +++++++++++++++++++++++-----------------
> 1 file changed, 39 insertions(+), 28 deletions(-)
>
> diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
> index f42a59ed2e3f..41301d50b247 100644
> --- a/samples/vfio-mdev/mtty.c
> +++ b/samples/vfio-mdev/mtty.c
...
> +static int mtty_probe(struct mdev_device *mdev)
> +{
> + struct mdev_state *mdev_state;
> + int ret;
> +
> + mdev_state = vfio_alloc_device(mdev_state, vdev, &mdev->dev,
> + &mtty_dev_ops);
> + if (IS_ERR(mdev_state))
> + return PTR_ERR(mdev_state);
>
> ret = vfio_register_emulated_iommu_dev(&mdev_state->vdev);
> if (ret)
> - goto err_vconfig;
> + goto err_put_vdev;
> dev_set_drvdata(&mdev->dev, mdev_state);
> return 0;
>
> -err_vconfig:
> - kfree(mdev_state->vconfig);
> -err_state:
> - vfio_uninit_group_dev(&mdev_state->vdev);
> - kfree(mdev_state);
> -err_nr_ports:
> - atomic_add(nr_ports, &mdev_avail_ports);
> +err_put_vdev:
> + vfio_put_device(&mdev_state->vdev);
> return ret;
> }
>
> +static void mtty_release_dev(struct vfio_device *vdev)
> +{
> + struct mdev_state *mdev_state =
> + container_of(vdev, struct mdev_state, vdev);
> +
> + kfree(mdev_state->vconfig);
> + vfio_free_device(vdev);
> + atomic_add(mdev_state->nr_ports, &mdev_avail_ports);

I must be missing something, isn't this a use-after-free?

mdev_state is allocated via vfio_alloc_device(), where vdev is the
first entry in that structure, so this is equivalent to
kvfree(mdev_state). mbochs has the same issue. mdpy and vfio-ap
adjust global counters after vfio_free_device(), which I think muddies
the situation. Shouldn't we look suspiciously at any .release callback
where vfio_free_device() isn't the last thing executed? Thanks,

Alex

2022-09-20 20:34:09

by Alex Williamson

[permalink] [raw]
Subject: Re: [PATCH v3 15/15] vfio: Add struct device to vfio_device

On Fri, 9 Sep 2022 18:22:47 +0800
Kevin Tian <[email protected]> wrote:

> From: Yi Liu <[email protected]>
>
> and replace kref. With it a 'vfio-dev/vfioX' node is created under the
> sysfs path of the parent, indicating the device is bound to a vfio
> driver, e.g.:
>
> /sys/devices/pci0000\:6f/0000\:6f\:01.0/vfio-dev/vfio0
>
> It is also a preparatory step toward adding cdev for supporting future
> device-oriented uAPI.
>
> Add Documentation/ABI/testing/sysfs-devices-vfio-dev.
>
> Also take this chance to rename chardev 'vfio' to 'vfio-group' in
> /proc/devices.

What's the risk/reward here, is this just more aesthetically pleasing
symmetry vs 'vfio-dev'? The char major number to name association in
/proc/devices seems pretty obscure, but what due diligence have we done
to make sure this doesn't break anyone? Thanks,

Alex

2022-09-20 23:22:47

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH v3 15/15] vfio: Add struct device to vfio_device

> From: Alex Williamson <[email protected]>
> Sent: Wednesday, September 21, 2022 4:27 AM
>
> On Fri, 9 Sep 2022 18:22:47 +0800
> Kevin Tian <[email protected]> wrote:
>
> > From: Yi Liu <[email protected]>
> >
> > and replace kref. With it a 'vfio-dev/vfioX' node is created under the
> > sysfs path of the parent, indicating the device is bound to a vfio
> > driver, e.g.:
> >
> > /sys/devices/pci0000\:6f/0000\:6f\:01.0/vfio-dev/vfio0
> >
> > It is also a preparatory step toward adding cdev for supporting future
> > device-oriented uAPI.
> >
> > Add Documentation/ABI/testing/sysfs-devices-vfio-dev.
> >
> > Also take this chance to rename chardev 'vfio' to 'vfio-group' in
> > /proc/devices.
>
> What's the risk/reward here, is this just more aesthetically pleasing
> symmetry vs 'vfio-dev'? The char major number to name association in
> /proc/devices seems pretty obscure, but what due diligence have we done
> to make sure this doesn't break anyone? Thanks,
>

I'm not sure whether the content of /proc/devices is considered as ABI.

@Jason?

But to be safe I can remove this change in next version. If it's the right
thing to do such change after discussion then it can be done in a separate
patch after.

2022-09-20 23:30:11

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH v3 06/15] vfio/mtty: Use the new device life cycle helpers

> From: Alex Williamson <[email protected]>
> Sent: Wednesday, September 21, 2022 3:17 AM
>
> On Fri, 9 Sep 2022 18:22:38 +0800
> Kevin Tian <[email protected]> wrote:
>
> > From: Yi Liu <[email protected]>
> >
> > and manage available ports inside @init/@release.
> >
> > Signed-off-by: Yi Liu <[email protected]>
> > Signed-off-by: Kevin Tian <[email protected]>
> > Reviewed-by: Jason Gunthorpe <[email protected]>
> > ---
> > samples/vfio-mdev/mtty.c | 67 +++++++++++++++++++++++-----------------
> > 1 file changed, 39 insertions(+), 28 deletions(-)
> >
> > diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
> > index f42a59ed2e3f..41301d50b247 100644
> > --- a/samples/vfio-mdev/mtty.c
> > +++ b/samples/vfio-mdev/mtty.c
> ...
> > +static int mtty_probe(struct mdev_device *mdev)
> > +{
> > + struct mdev_state *mdev_state;
> > + int ret;
> > +
> > + mdev_state = vfio_alloc_device(mdev_state, vdev, &mdev->dev,
> > + &mtty_dev_ops);
> > + if (IS_ERR(mdev_state))
> > + return PTR_ERR(mdev_state);
> >
> > ret = vfio_register_emulated_iommu_dev(&mdev_state->vdev);
> > if (ret)
> > - goto err_vconfig;
> > + goto err_put_vdev;
> > dev_set_drvdata(&mdev->dev, mdev_state);
> > return 0;
> >
> > -err_vconfig:
> > - kfree(mdev_state->vconfig);
> > -err_state:
> > - vfio_uninit_group_dev(&mdev_state->vdev);
> > - kfree(mdev_state);
> > -err_nr_ports:
> > - atomic_add(nr_ports, &mdev_avail_ports);
> > +err_put_vdev:
> > + vfio_put_device(&mdev_state->vdev);
> > return ret;
> > }
> >
> > +static void mtty_release_dev(struct vfio_device *vdev)
> > +{
> > + struct mdev_state *mdev_state =
> > + container_of(vdev, struct mdev_state, vdev);
> > +
> > + kfree(mdev_state->vconfig);
> > + vfio_free_device(vdev);
> > + atomic_add(mdev_state->nr_ports, &mdev_avail_ports);
>
> I must be missing something, isn't this a use-after-free?

Yes, it's a use-after-free indeed. Thanks for catching it!

>
> mdev_state is allocated via vfio_alloc_device(), where vdev is the
> first entry in that structure, so this is equivalent to
> kvfree(mdev_state). mbochs has the same issue. mdpy and vfio-ap
> adjust global counters after vfio_free_device(), which I think muddies
> the situation. Shouldn't we look suspiciously at any .release callback
> where vfio_free_device() isn't the last thing executed? Thanks,
>

Yes. I'll scrutinize it again. To be consistent I'll make sure the free
is the last line in all .release callbacks though not required for
some (e.g. mdpy and vfio-ap).

2022-09-21 17:10:46

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v3 15/15] vfio: Add struct device to vfio_device

On Tue, Sep 20, 2022 at 10:55:40PM +0000, Tian, Kevin wrote:
> > From: Alex Williamson <[email protected]>
> > Sent: Wednesday, September 21, 2022 4:27 AM
> >
> > On Fri, 9 Sep 2022 18:22:47 +0800
> > Kevin Tian <[email protected]> wrote:
> >
> > > From: Yi Liu <[email protected]>
> > >
> > > and replace kref. With it a 'vfio-dev/vfioX' node is created under the
> > > sysfs path of the parent, indicating the device is bound to a vfio
> > > driver, e.g.:
> > >
> > > /sys/devices/pci0000\:6f/0000\:6f\:01.0/vfio-dev/vfio0
> > >
> > > It is also a preparatory step toward adding cdev for supporting future
> > > device-oriented uAPI.
> > >
> > > Add Documentation/ABI/testing/sysfs-devices-vfio-dev.
> > >
> > > Also take this chance to rename chardev 'vfio' to 'vfio-group' in
> > > /proc/devices.
> >
> > What's the risk/reward here, is this just more aesthetically pleasing
> > symmetry vs 'vfio-dev'? The char major number to name association in
> > /proc/devices seems pretty obscure, but what due diligence have we done
> > to make sure this doesn't break anyone? Thanks,
>
> I'm not sure whether the content of /proc/devices is considered as ABI.
>
> @Jason?

Ah, I've forgotten why we got here - didn't we have a naming conflict
with the new stuff that is being introduced?

ABI wise it is not a problem unless there is a real user, I'm not
aware of anything scanning /proc, that has been obsoleted by sysfs a
long time ago.

Jason

2022-09-22 05:27:20

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH v3 15/15] vfio: Add struct device to vfio_device

> From: Jason Gunthorpe <[email protected]>
> Sent: Thursday, September 22, 2022 12:10 AM
>
> On Tue, Sep 20, 2022 at 10:55:40PM +0000, Tian, Kevin wrote:
> > > From: Alex Williamson <[email protected]>
> > > Sent: Wednesday, September 21, 2022 4:27 AM
> > >
> > > On Fri, 9 Sep 2022 18:22:47 +0800
> > > Kevin Tian <[email protected]> wrote:
> > >
> > > > From: Yi Liu <[email protected]>
> > > >
> > > > and replace kref. With it a 'vfio-dev/vfioX' node is created under the
> > > > sysfs path of the parent, indicating the device is bound to a vfio
> > > > driver, e.g.:
> > > >
> > > > /sys/devices/pci0000\:6f/0000\:6f\:01.0/vfio-dev/vfio0
> > > >
> > > > It is also a preparatory step toward adding cdev for supporting future
> > > > device-oriented uAPI.
> > > >
> > > > Add Documentation/ABI/testing/sysfs-devices-vfio-dev.
> > > >
> > > > Also take this chance to rename chardev 'vfio' to 'vfio-group' in
> > > > /proc/devices.
> > >
> > > What's the risk/reward here, is this just more aesthetically pleasing
> > > symmetry vs 'vfio-dev'? The char major number to name association in
> > > /proc/devices seems pretty obscure, but what due diligence have we
> done
> > > to make sure this doesn't break anyone? Thanks,
> >
> > I'm not sure whether the content of /proc/devices is considered as ABI.
> >
> > @Jason?
>
> Ah, I've forgotten why we got here - didn't we have a naming conflict
> with the new stuff that is being introduced?

No, we don't have. There is no new char dev introduced in this series.

Later when device cdev is added a new device major will be allocated for
'vfio-dev'. It's at that time renaming existing 'vfio' to 'vfio-group' is probably
clearer, which is what I understood from your earlier suggestion.

>
> ABI wise it is not a problem unless there is a real user, I'm not
> aware of anything scanning /proc, that has been obsoleted by sysfs a
> long time ago.
>

This is a good news.