2016-12-22 20:21:57

by Alex Williamson

[permalink] [raw]
Subject: [PATCH v2 0/4] vfio-mdev: Fix remove race, clean namespace and better define ABI

Cleanup the namespace a bit by prefixing structures with mdev_ and
also more concretely define the mdev interface. Structs with comments
defining which fields are private vs public tempts poor behavior,
especially for an interface where we expect out of tree vendor drivers.

Additionally in v2, the patch removing the next field from mdev_device
is dropped, instead using it to fix a remove race, my From address is
fixed, and Documentation is updated. Jike, I left your R-b on the
patches that didn't change only. If I've missed any relevant doc
updates, please let me know. Thanks,

Alex

---

Alex Williamson (4):
vfio-mdev: Fix remove race
vfio-mdev: de-polute the namespace, rename parent_device & parent_ops
vfio-mdev: Make mdev_parent private
vfio-mdev: Make mdev_device private and abstract interfaces


Documentation/vfio-mediated-device.txt | 27 +++++----
drivers/gpu/drm/i915/gvt/kvmgt.c | 22 ++++---
drivers/vfio/mdev/mdev_core.c | 100 +++++++++++++++++++++++++++-----
drivers/vfio/mdev/mdev_private.h | 29 ++++++++-
drivers/vfio/mdev/mdev_sysfs.c | 8 +--
drivers/vfio/mdev/vfio_mdev.c | 12 ++--
include/linux/mdev.h | 54 ++++-------------
samples/vfio-mdev/mtty.c | 28 +++++----
8 files changed, 173 insertions(+), 107 deletions(-)


2016-12-22 20:22:07

by Alex Williamson

[permalink] [raw]
Subject: [PATCH v2 2/4] vfio-mdev: de-polute the namespace, rename parent_device & parent_ops

Add an mdev_ prefix so we're not poluting the namespace so much.

Cc: Kirti Wankhede <[email protected]>
Cc: Zhenyu Wang <[email protected]>
Cc: Zhi Wang <[email protected]>
Cc: Jike Song <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>
---
Documentation/vfio-mediated-device.txt | 24 ++++++++++++------------
drivers/gpu/drm/i915/gvt/kvmgt.c | 2 +-
drivers/vfio/mdev/mdev_core.c | 28 ++++++++++++++--------------
drivers/vfio/mdev/mdev_private.h | 6 +++---
drivers/vfio/mdev/mdev_sysfs.c | 8 ++++----
drivers/vfio/mdev/vfio_mdev.c | 12 ++++++------
include/linux/mdev.h | 16 ++++++++--------
samples/vfio-mdev/mtty.c | 2 +-
8 files changed, 49 insertions(+), 49 deletions(-)

diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
index b38afec..cfee106 100644
--- a/Documentation/vfio-mediated-device.txt
+++ b/Documentation/vfio-mediated-device.txt
@@ -127,22 +127,22 @@ the VFIO when devices are unbound from the driver.
Physical Device Driver Interface
--------------------------------

-The physical device driver interface provides the parent_ops[3] structure to
-define the APIs to manage work in the mediated core driver that is related to
-the physical device.
+The physical device driver interface provides the mdev_parent_ops[3] structure
+to define the APIs to manage work in the mediated core driver that is related
+to the physical device.

-The structures in the parent_ops structure are as follows:
+The structures in the mdev_parent_ops structure are as follows:

* dev_attr_groups: attributes of the parent device
* mdev_attr_groups: attributes of the mediated device
* supported_config: attributes to define supported configurations

-The functions in the parent_ops structure are as follows:
+The functions in the mdev_parent_ops structure are as follows:

* create: allocate basic resources in a driver for a mediated device
* remove: free resources in a driver when a mediated device is destroyed

-The callbacks in the parent_ops structure are as follows:
+The callbacks in the mdev_parent_ops structure are as follows:

* open: open callback of mediated device
* close: close callback of mediated device
@@ -151,14 +151,14 @@ The callbacks in the parent_ops structure are as follows:
* write: write emulation callback
* mmap: mmap emulation callback

-A driver should use the parent_ops structure in the function call to register
-itself with the mdev core driver:
+A driver should use the mdev_parent_ops structure in the function call to
+register itself with the mdev core driver:

extern int mdev_register_device(struct device *dev,
- const struct parent_ops *ops);
+ const struct mdev_parent_ops *ops);

-However, the parent_ops structure is not required in the function call that a
-driver should use to unregister itself with the mdev core driver:
+However, the mdev_parent_ops structure is not required in the function call
+that a driver should use to unregister itself with the mdev core driver:

extern void mdev_unregister_device(struct device *dev);

@@ -394,5 +394,5 @@ References

[1] See Documentation/vfio.txt for more information on VFIO.
[2] struct mdev_driver in include/linux/mdev.h
-[3] struct parent_ops in include/linux/mdev.h
+[3] struct mdev_parent_ops in include/linux/mdev.h
[4] struct vfio_iommu_driver_ops in include/linux/vfio.h
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index 4dd6722..081ada2 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -1089,7 +1089,7 @@ static long intel_vgpu_ioctl(struct mdev_device *mdev, unsigned int cmd,
return 0;
}

-static const struct parent_ops intel_vgpu_ops = {
+static const struct mdev_parent_ops intel_vgpu_ops = {
.supported_type_groups = intel_vgpu_type_groups,
.create = intel_vgpu_create,
.remove = intel_vgpu_remove,
diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
index 6bb4d4c..bf3b3b0 100644
--- a/drivers/vfio/mdev/mdev_core.c
+++ b/drivers/vfio/mdev/mdev_core.c
@@ -45,7 +45,7 @@ static int _find_mdev_device(struct device *dev, void *data)
return 0;
}

-static bool mdev_device_exist(struct parent_device *parent, uuid_le uuid)
+static bool mdev_device_exist(struct mdev_parent *parent, uuid_le uuid)
{
struct device *dev;

@@ -59,9 +59,9 @@ static bool mdev_device_exist(struct parent_device *parent, uuid_le uuid)
}

/* Should be called holding parent_list_lock */
-static struct parent_device *__find_parent_device(struct device *dev)
+static struct mdev_parent *__find_parent_device(struct device *dev)
{
- struct parent_device *parent;
+ struct mdev_parent *parent;

list_for_each_entry(parent, &parent_list, next) {
if (parent->dev == dev)
@@ -72,8 +72,8 @@ static struct parent_device *__find_parent_device(struct device *dev)

static void mdev_release_parent(struct kref *kref)
{
- struct parent_device *parent = container_of(kref, struct parent_device,
- ref);
+ struct mdev_parent *parent = container_of(kref, struct mdev_parent,
+ ref);
struct device *dev = parent->dev;

kfree(parent);
@@ -81,7 +81,7 @@ static void mdev_release_parent(struct kref *kref)
}

static
-inline struct parent_device *mdev_get_parent(struct parent_device *parent)
+inline struct mdev_parent *mdev_get_parent(struct mdev_parent *parent)
{
if (parent)
kref_get(&parent->ref);
@@ -89,7 +89,7 @@ inline struct parent_device *mdev_get_parent(struct parent_device *parent)
return parent;
}

-static inline void mdev_put_parent(struct parent_device *parent)
+static inline void mdev_put_parent(struct mdev_parent *parent)
{
if (parent)
kref_put(&parent->ref, mdev_release_parent);
@@ -98,7 +98,7 @@ static inline void mdev_put_parent(struct parent_device *parent)
static int mdev_device_create_ops(struct kobject *kobj,
struct mdev_device *mdev)
{
- struct parent_device *parent = mdev->parent;
+ struct mdev_parent *parent = mdev->parent;
int ret;

ret = parent->ops->create(kobj, mdev);
@@ -125,7 +125,7 @@ static int mdev_device_create_ops(struct kobject *kobj,
*/
static int mdev_device_remove_ops(struct mdev_device *mdev, bool force_remove)
{
- struct parent_device *parent = mdev->parent;
+ struct mdev_parent *parent = mdev->parent;
int ret;

/*
@@ -156,10 +156,10 @@ static int mdev_device_remove_cb(struct device *dev, void *data)
* Add device to list of registered parent devices.
* Returns a negative value on error, otherwise 0.
*/
-int mdev_register_device(struct device *dev, const struct parent_ops *ops)
+int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops)
{
int ret;
- struct parent_device *parent;
+ struct mdev_parent *parent;

/* check for mandatory ops */
if (!ops || !ops->create || !ops->remove || !ops->supported_type_groups)
@@ -232,7 +232,7 @@ int mdev_register_device(struct device *dev, const struct parent_ops *ops)

void mdev_unregister_device(struct device *dev)
{
- struct parent_device *parent;
+ struct mdev_parent *parent;
bool force_remove = true;

mutex_lock(&parent_list_lock);
@@ -269,7 +269,7 @@ int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le uuid)
{
int ret;
struct mdev_device *mdev;
- struct parent_device *parent;
+ struct mdev_parent *parent;
struct mdev_type *type = to_mdev_type(kobj);

parent = mdev_get_parent(type->parent);
@@ -338,7 +338,7 @@ int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le uuid)
int mdev_device_remove(struct device *dev, bool force_remove)
{
struct mdev_device *mdev, *tmp;
- struct parent_device *parent;
+ struct mdev_parent *parent;
struct mdev_type *type;
int ret;
bool found = false;
diff --git a/drivers/vfio/mdev/mdev_private.h b/drivers/vfio/mdev/mdev_private.h
index d35097c..0b72c2d9 100644
--- a/drivers/vfio/mdev/mdev_private.h
+++ b/drivers/vfio/mdev/mdev_private.h
@@ -19,7 +19,7 @@
struct mdev_type {
struct kobject kobj;
struct kobject *devices_kobj;
- struct parent_device *parent;
+ struct mdev_parent *parent;
struct list_head next;
struct attribute_group *group;
};
@@ -29,8 +29,8 @@ struct mdev_type {
#define to_mdev_type(_kobj) \
container_of(_kobj, struct mdev_type, kobj)

-int parent_create_sysfs_files(struct parent_device *parent);
-void parent_remove_sysfs_files(struct parent_device *parent);
+int parent_create_sysfs_files(struct mdev_parent *parent);
+void parent_remove_sysfs_files(struct mdev_parent *parent);

int mdev_create_sysfs_files(struct device *dev, struct mdev_type *type);
void mdev_remove_sysfs_files(struct device *dev, struct mdev_type *type);
diff --git a/drivers/vfio/mdev/mdev_sysfs.c b/drivers/vfio/mdev/mdev_sysfs.c
index 1a53deb..802df21 100644
--- a/drivers/vfio/mdev/mdev_sysfs.c
+++ b/drivers/vfio/mdev/mdev_sysfs.c
@@ -92,7 +92,7 @@ static void mdev_type_release(struct kobject *kobj)
.release = mdev_type_release,
};

-struct mdev_type *add_mdev_supported_type(struct parent_device *parent,
+struct mdev_type *add_mdev_supported_type(struct mdev_parent *parent,
struct attribute_group *group)
{
struct mdev_type *type;
@@ -158,7 +158,7 @@ static void remove_mdev_supported_type(struct mdev_type *type)
kobject_put(&type->kobj);
}

-static int add_mdev_supported_type_groups(struct parent_device *parent)
+static int add_mdev_supported_type_groups(struct mdev_parent *parent)
{
int i;

@@ -183,7 +183,7 @@ static int add_mdev_supported_type_groups(struct parent_device *parent)
}

/* mdev sysfs functions */
-void parent_remove_sysfs_files(struct parent_device *parent)
+void parent_remove_sysfs_files(struct mdev_parent *parent)
{
struct mdev_type *type, *tmp;

@@ -196,7 +196,7 @@ void parent_remove_sysfs_files(struct parent_device *parent)
kset_unregister(parent->mdev_types_kset);
}

-int parent_create_sysfs_files(struct parent_device *parent)
+int parent_create_sysfs_files(struct mdev_parent *parent)
{
int ret;

diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
index ffc3675..fa848a7 100644
--- a/drivers/vfio/mdev/vfio_mdev.c
+++ b/drivers/vfio/mdev/vfio_mdev.c
@@ -27,7 +27,7 @@
static int vfio_mdev_open(void *device_data)
{
struct mdev_device *mdev = device_data;
- struct parent_device *parent = mdev->parent;
+ struct mdev_parent *parent = mdev->parent;
int ret;

if (unlikely(!parent->ops->open))
@@ -46,7 +46,7 @@ static int vfio_mdev_open(void *device_data)
static void vfio_mdev_release(void *device_data)
{
struct mdev_device *mdev = device_data;
- struct parent_device *parent = mdev->parent;
+ struct mdev_parent *parent = mdev->parent;

if (likely(parent->ops->release))
parent->ops->release(mdev);
@@ -58,7 +58,7 @@ static long vfio_mdev_unlocked_ioctl(void *device_data,
unsigned int cmd, unsigned long arg)
{
struct mdev_device *mdev = device_data;
- struct parent_device *parent = mdev->parent;
+ struct mdev_parent *parent = mdev->parent;

if (unlikely(!parent->ops->ioctl))
return -EINVAL;
@@ -70,7 +70,7 @@ static ssize_t vfio_mdev_read(void *device_data, char __user *buf,
size_t count, loff_t *ppos)
{
struct mdev_device *mdev = device_data;
- struct parent_device *parent = mdev->parent;
+ struct mdev_parent *parent = mdev->parent;

if (unlikely(!parent->ops->read))
return -EINVAL;
@@ -82,7 +82,7 @@ static ssize_t vfio_mdev_write(void *device_data, const char __user *buf,
size_t count, loff_t *ppos)
{
struct mdev_device *mdev = device_data;
- struct parent_device *parent = mdev->parent;
+ struct mdev_parent *parent = mdev->parent;

if (unlikely(!parent->ops->write))
return -EINVAL;
@@ -93,7 +93,7 @@ static ssize_t vfio_mdev_write(void *device_data, const char __user *buf,
static int vfio_mdev_mmap(void *device_data, struct vm_area_struct *vma)
{
struct mdev_device *mdev = device_data;
- struct parent_device *parent = mdev->parent;
+ struct mdev_parent *parent = mdev->parent;

if (unlikely(!parent->ops->mmap))
return -EINVAL;
diff --git a/include/linux/mdev.h b/include/linux/mdev.h
index ec819e9..853bb78 100644
--- a/include/linux/mdev.h
+++ b/include/linux/mdev.h
@@ -14,9 +14,9 @@
#define MDEV_H

/* Parent device */
-struct parent_device {
- struct device *dev;
- const struct parent_ops *ops;
+struct mdev_parent {
+ struct device *dev;
+ const struct mdev_parent_ops *ops;

/* internal */
struct kref ref;
@@ -29,7 +29,7 @@ struct parent_device {
/* Mediated device */
struct mdev_device {
struct device dev;
- struct parent_device *parent;
+ struct mdev_parent *parent;
uuid_le uuid;
void *driver_data;

@@ -40,7 +40,7 @@ struct mdev_device {
};

/**
- * struct parent_ops - Structure to be registered for each parent device to
+ * struct mdev_parent_ops - Structure to be registered for each parent device to
* register the device to mdev module.
*
* @owner: The module owner.
@@ -86,10 +86,10 @@ struct mdev_device {
* @mdev: mediated device structure
* @vma: vma structure
* Parent device that support mediated device should be registered with mdev
- * module with parent_ops structure.
+ * module with mdev_parent_ops structure.
**/

-struct parent_ops {
+struct mdev_parent_ops {
struct module *owner;
const struct attribute_group **dev_attr_groups;
const struct attribute_group **mdev_attr_groups;
@@ -159,7 +159,7 @@ static inline void mdev_set_drvdata(struct mdev_device *mdev, void *data)
#define dev_is_mdev(d) ((d)->bus == &mdev_bus_type)

extern int mdev_register_device(struct device *dev,
- const struct parent_ops *ops);
+ const struct mdev_parent_ops *ops);
extern void mdev_unregister_device(struct device *dev);

extern int mdev_register_driver(struct mdev_driver *drv, struct module *owner);
diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
index 6b633a4..1a74f0e 100644
--- a/samples/vfio-mdev/mtty.c
+++ b/samples/vfio-mdev/mtty.c
@@ -1402,7 +1402,7 @@ struct attribute_group *mdev_type_groups[] = {
NULL,
};

-struct parent_ops mdev_fops = {
+struct mdev_parent_ops mdev_fops = {
.owner = THIS_MODULE,
.dev_attr_groups = mtty_dev_groups,
.mdev_attr_groups = mdev_dev_groups,

2016-12-22 20:22:02

by Alex Williamson

[permalink] [raw]
Subject: [PATCH v2 1/4] vfio-mdev: Fix remove race

Using the mtty mdev sample driver we can generate a remove race by
starting one shell that continuously creates mtty devices and several
other shells all attempting to remove devices, in my case four remove
shells. The fault occurs in mdev_remove_sysfs_files() where the
passed type arg is NULL, which suggests we've received a struct device
in mdev_device_remove() but it's in some sort of teardown state. The
solution here is to make use of the accidentally unused list_head on
the mdev_device such that the mdev core keeps a list of all the mdev
devices. This allows us to validate that we have a valid mdev before
we start removal, remove it from the list to prevent others from
working on it, and if the vendor driver refuses to remove, we can
re-add it to the list.

Cc: Kirti Wankhede <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>
---
drivers/vfio/mdev/mdev_core.c | 36 ++++++++++++++++++++++++++++++++++--
1 file changed, 34 insertions(+), 2 deletions(-)

diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
index be1ee89..6bb4d4c 100644
--- a/drivers/vfio/mdev/mdev_core.c
+++ b/drivers/vfio/mdev/mdev_core.c
@@ -27,6 +27,9 @@
static DEFINE_MUTEX(parent_list_lock);
static struct class_compat *mdev_bus_compat_class;

+static LIST_HEAD(mdev_list);
+static DEFINE_MUTEX(mdev_list_lock);
+
static int _find_mdev_device(struct device *dev, void *data)
{
struct mdev_device *mdev;
@@ -316,6 +319,11 @@ int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le uuid)
dev_dbg(&mdev->dev, "MDEV: created\n");

mutex_unlock(&parent->lock);
+
+ mutex_lock(&mdev_list_lock);
+ list_add(&mdev->next, &mdev_list);
+ mutex_unlock(&mdev_list_lock);
+
return ret;

create_failed:
@@ -329,12 +337,30 @@ int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le uuid)

int mdev_device_remove(struct device *dev, bool force_remove)
{
- struct mdev_device *mdev;
+ struct mdev_device *mdev, *tmp;
struct parent_device *parent;
struct mdev_type *type;
int ret;
+ bool found = false;

mdev = to_mdev_device(dev);
+
+ mutex_lock(&mdev_list_lock);
+ list_for_each_entry(tmp, &mdev_list, next) {
+ if (tmp == mdev) {
+ found = true;
+ break;
+ }
+ }
+
+ if (found)
+ list_del(&mdev->next);
+
+ mutex_unlock(&mdev_list_lock);
+
+ if (!found)
+ return -ENODEV;
+
type = to_mdev_type(mdev->type_kobj);
parent = mdev->parent;
mutex_lock(&parent->lock);
@@ -342,6 +368,11 @@ int mdev_device_remove(struct device *dev, bool force_remove)
ret = mdev_device_remove_ops(mdev, force_remove);
if (ret) {
mutex_unlock(&parent->lock);
+
+ mutex_lock(&mdev_list_lock);
+ list_add(&mdev->next, &mdev_list);
+ mutex_unlock(&mdev_list_lock);
+
return ret;
}

@@ -349,7 +380,8 @@ int mdev_device_remove(struct device *dev, bool force_remove)
device_unregister(dev);
mutex_unlock(&parent->lock);
mdev_put_parent(parent);
- return ret;
+
+ return 0;
}

static int __init mdev_init(void)

2016-12-22 20:22:13

by Alex Williamson

[permalink] [raw]
Subject: [PATCH v2 3/4] vfio-mdev: Make mdev_parent private

Rather than hoping for good behavior by marking some elements
internal, enforce it by making the entire structure private and
creating an accessor function for the one useful external field.

Cc: Kirti Wankhede <[email protected]>
Cc: Zhenyu Wang <[email protected]>
Cc: Zhi Wang <[email protected]>
Cc: Jike Song <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>
---
Documentation/vfio-mediated-device.txt | 3 +++
drivers/gpu/drm/i915/gvt/kvmgt.c | 2 +-
drivers/vfio/mdev/mdev_core.c | 6 ++++++
drivers/vfio/mdev/mdev_private.h | 10 ++++++++++
include/linux/mdev.h | 15 ++-------------
samples/vfio-mdev/mtty.c | 2 +-
6 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
index cfee106..d226c7a 100644
--- a/Documentation/vfio-mediated-device.txt
+++ b/Documentation/vfio-mediated-device.txt
@@ -223,6 +223,9 @@ Directories and files under the sysfs for Each Physical Device

sprintf(buf, "%s-%s", dev_driver_string(parent->dev), group->name);

+ (or using mdev_parent_dev(mdev) to arrive at the parent device outside
+ of the core mdev code)
+
* device_api

This attribute should show which device API is being created, for example,
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index 081ada2..3850032 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -396,7 +396,7 @@ static int intel_vgpu_create(struct kobject *kobj, struct mdev_device *mdev)
struct device *pdev;
void *gvt;

- pdev = mdev->parent->dev;
+ pdev = mdev_parent_dev(mdev);
gvt = kdev_to_i915(pdev)->gvt;

type = intel_gvt_find_vgpu_type(gvt, kobject_name(kobj));
diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
index bf3b3b0..30d0530 100644
--- a/drivers/vfio/mdev/mdev_core.c
+++ b/drivers/vfio/mdev/mdev_core.c
@@ -30,6 +30,12 @@
static LIST_HEAD(mdev_list);
static DEFINE_MUTEX(mdev_list_lock);

+struct device *mdev_parent_dev(struct mdev_device *mdev)
+{
+ return mdev->parent->dev;
+}
+EXPORT_SYMBOL(mdev_parent_dev);
+
static int _find_mdev_device(struct device *dev, void *data)
{
struct mdev_device *mdev;
diff --git a/drivers/vfio/mdev/mdev_private.h b/drivers/vfio/mdev/mdev_private.h
index 0b72c2d9..b05dd22 100644
--- a/drivers/vfio/mdev/mdev_private.h
+++ b/drivers/vfio/mdev/mdev_private.h
@@ -16,6 +16,16 @@
int mdev_bus_register(void);
void mdev_bus_unregister(void);

+struct mdev_parent {
+ struct device *dev;
+ const struct mdev_parent_ops *ops;
+ struct kref ref;
+ struct mutex lock;
+ struct list_head next;
+ struct kset *mdev_types_kset;
+ struct list_head type_list;
+};
+
struct mdev_type {
struct kobject kobj;
struct kobject *devices_kobj;
diff --git a/include/linux/mdev.h b/include/linux/mdev.h
index 853bb78..f586222 100644
--- a/include/linux/mdev.h
+++ b/include/linux/mdev.h
@@ -13,19 +13,6 @@
#ifndef MDEV_H
#define MDEV_H

-/* Parent device */
-struct mdev_parent {
- struct device *dev;
- const struct mdev_parent_ops *ops;
-
- /* internal */
- struct kref ref;
- struct mutex lock;
- struct list_head next;
- struct kset *mdev_types_kset;
- struct list_head type_list;
-};
-
/* Mediated device */
struct mdev_device {
struct device dev;
@@ -165,4 +152,6 @@ extern int mdev_register_device(struct device *dev,
extern int mdev_register_driver(struct mdev_driver *drv, struct module *owner);
extern void mdev_unregister_driver(struct mdev_driver *drv);

+extern struct device *mdev_parent_dev(struct mdev_device *mdev);
+
#endif /* MDEV_H */
diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
index 1a74f0e..5e13efc 100644
--- a/samples/vfio-mdev/mtty.c
+++ b/samples/vfio-mdev/mtty.c
@@ -734,7 +734,7 @@ int mtty_create(struct kobject *kobj, struct mdev_device *mdev)

for (i = 0; i < 2; i++) {
snprintf(name, MTTY_STRING_LEN, "%s-%d",
- dev_driver_string(mdev->parent->dev), i + 1);
+ dev_driver_string(mdev_parent_dev(mdev)), i + 1);
if (!strcmp(kobj->name, name)) {
nr_ports = i + 1;
break;

2016-12-22 20:22:36

by Alex Williamson

[permalink] [raw]
Subject: [PATCH v2 4/4] vfio-mdev: Make mdev_device private and abstract interfaces

Abstract access to mdev_device so that we can define which interfaces
are public rather than relying on comments in the structure.

Cc: Kirti Wankhede <[email protected]>
Cc: Zhenyu Wang <[email protected]>
Cc: Zhi Wang <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>
Reviewed-by: Jike Song <[email protected]>
---
drivers/gpu/drm/i915/gvt/kvmgt.c | 18 +++++++++---------
drivers/vfio/mdev/mdev_core.c | 30 ++++++++++++++++++++++++++++++
drivers/vfio/mdev/mdev_private.h | 13 +++++++++++++
include/linux/mdev.h | 31 ++++++-------------------------
samples/vfio-mdev/mtty.c | 24 +++++++++++++-----------
5 files changed, 71 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index 3850032..f8021a0 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -166,7 +166,7 @@ static void __gvt_cache_remove_entry(struct intel_vgpu *vgpu,

static void gvt_cache_remove(struct intel_vgpu *vgpu, gfn_t gfn)
{
- struct device *dev = &vgpu->vdev.mdev->dev;
+ struct device *dev = mdev_dev(vgpu->vdev.mdev);
struct gvt_dma *this;
unsigned long g1;
int rc;
@@ -195,7 +195,7 @@ static void gvt_cache_destroy(struct intel_vgpu *vgpu)
{
struct gvt_dma *dma;
struct rb_node *node = NULL;
- struct device *dev = &vgpu->vdev.mdev->dev;
+ struct device *dev = mdev_dev(vgpu->vdev.mdev);
unsigned long gfn;

mutex_lock(&vgpu->vdev.cache_lock);
@@ -418,7 +418,7 @@ static int intel_vgpu_create(struct kobject *kobj, struct mdev_device *mdev)
mdev_set_drvdata(mdev, vgpu);

gvt_dbg_core("intel_vgpu_create succeeded for mdev: %s\n",
- dev_name(&mdev->dev));
+ dev_name(mdev_dev(mdev)));
return 0;
}

@@ -482,7 +482,7 @@ static int intel_vgpu_open(struct mdev_device *mdev)
vgpu->vdev.group_notifier.notifier_call = intel_vgpu_group_notifier;

events = VFIO_IOMMU_NOTIFY_DMA_UNMAP;
- ret = vfio_register_notifier(&mdev->dev, VFIO_IOMMU_NOTIFY, &events,
+ ret = vfio_register_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY, &events,
&vgpu->vdev.iommu_notifier);
if (ret != 0) {
gvt_err("vfio_register_notifier for iommu failed: %d\n", ret);
@@ -490,7 +490,7 @@ static int intel_vgpu_open(struct mdev_device *mdev)
}

events = VFIO_GROUP_NOTIFY_SET_KVM;
- ret = vfio_register_notifier(&mdev->dev, VFIO_GROUP_NOTIFY, &events,
+ ret = vfio_register_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY, &events,
&vgpu->vdev.group_notifier);
if (ret != 0) {
gvt_err("vfio_register_notifier for group failed: %d\n", ret);
@@ -500,7 +500,7 @@ static int intel_vgpu_open(struct mdev_device *mdev)
return kvmgt_guest_init(mdev);

undo_iommu:
- vfio_unregister_notifier(&mdev->dev, VFIO_IOMMU_NOTIFY,
+ vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
&vgpu->vdev.iommu_notifier);
out:
return ret;
@@ -513,9 +513,9 @@ static void __intel_vgpu_release(struct intel_vgpu *vgpu)
if (!handle_valid(vgpu->handle))
return;

- vfio_unregister_notifier(&vgpu->vdev.mdev->dev, VFIO_IOMMU_NOTIFY,
+ vfio_unregister_notifier(mdev_dev(vgpu->vdev.mdev), VFIO_IOMMU_NOTIFY,
&vgpu->vdev.iommu_notifier);
- vfio_unregister_notifier(&vgpu->vdev.mdev->dev, VFIO_GROUP_NOTIFY,
+ vfio_unregister_notifier(mdev_dev(vgpu->vdev.mdev), VFIO_GROUP_NOTIFY,
&vgpu->vdev.group_notifier);

info = (struct kvmgt_guest_info *)vgpu->handle;
@@ -1372,7 +1372,7 @@ static unsigned long kvmgt_gfn_to_pfn(unsigned long handle, unsigned long gfn)
return pfn;

pfn = INTEL_GVT_INVALID_ADDR;
- dev = &info->vgpu->vdev.mdev->dev;
+ dev = mdev_dev(info->vgpu->vdev.mdev);
rc = vfio_pin_pages(dev, &gfn, 1, IOMMU_READ | IOMMU_WRITE, &pfn);
if (rc != 1) {
gvt_err("vfio_pin_pages failed for gfn 0x%lx: %d\n", gfn, rc);
diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
index 30d0530..36d75c3 100644
--- a/drivers/vfio/mdev/mdev_core.c
+++ b/drivers/vfio/mdev/mdev_core.c
@@ -36,6 +36,36 @@ struct device *mdev_parent_dev(struct mdev_device *mdev)
}
EXPORT_SYMBOL(mdev_parent_dev);

+void *mdev_get_drvdata(struct mdev_device *mdev)
+{
+ return mdev->driver_data;
+}
+EXPORT_SYMBOL(mdev_get_drvdata);
+
+void mdev_set_drvdata(struct mdev_device *mdev, void *data)
+{
+ mdev->driver_data = data;
+}
+EXPORT_SYMBOL(mdev_set_drvdata);
+
+struct device *mdev_dev(struct mdev_device *mdev)
+{
+ return &mdev->dev;
+}
+EXPORT_SYMBOL(mdev_dev);
+
+struct mdev_device *mdev_from_dev(struct device *dev)
+{
+ return dev_is_mdev(dev) ? to_mdev_device(dev) : NULL;
+}
+EXPORT_SYMBOL(mdev_from_dev);
+
+uuid_le mdev_uuid(struct mdev_device *mdev)
+{
+ return mdev->uuid;
+}
+EXPORT_SYMBOL(mdev_uuid);
+
static int _find_mdev_device(struct device *dev, void *data)
{
struct mdev_device *mdev;
diff --git a/drivers/vfio/mdev/mdev_private.h b/drivers/vfio/mdev/mdev_private.h
index b05dd22..a9cefd7 100644
--- a/drivers/vfio/mdev/mdev_private.h
+++ b/drivers/vfio/mdev/mdev_private.h
@@ -26,6 +26,19 @@ struct mdev_parent {
struct list_head type_list;
};

+struct mdev_device {
+ struct device dev;
+ struct mdev_parent *parent;
+ uuid_le uuid;
+ void *driver_data;
+ struct kref ref;
+ struct list_head next;
+ struct kobject *type_kobj;
+};
+
+#define to_mdev_device(dev) container_of(dev, struct mdev_device, dev)
+#define dev_is_mdev(d) ((d)->bus == &mdev_bus_type)
+
struct mdev_type {
struct kobject kobj;
struct kobject *devices_kobj;
diff --git a/include/linux/mdev.h b/include/linux/mdev.h
index f586222..3ee44b8 100644
--- a/include/linux/mdev.h
+++ b/include/linux/mdev.h
@@ -13,18 +13,7 @@
#ifndef MDEV_H
#define MDEV_H

-/* Mediated device */
-struct mdev_device {
- struct device dev;
- struct mdev_parent *parent;
- uuid_le uuid;
- void *driver_data;
-
- /* internal */
- struct kref ref;
- struct list_head next;
- struct kobject *type_kobj;
-};
+struct mdev_device;

/**
* struct mdev_parent_ops - Structure to be registered for each parent device to
@@ -75,7 +64,6 @@ struct mdev_device {
* Parent device that support mediated device should be registered with mdev
* module with mdev_parent_ops structure.
**/
-
struct mdev_parent_ops {
struct module *owner;
const struct attribute_group **dev_attr_groups;
@@ -129,22 +117,13 @@ struct mdev_driver {
};

#define to_mdev_driver(drv) container_of(drv, struct mdev_driver, driver)
-#define to_mdev_device(dev) container_of(dev, struct mdev_device, dev)

-static inline void *mdev_get_drvdata(struct mdev_device *mdev)
-{
- return mdev->driver_data;
-}
-
-static inline void mdev_set_drvdata(struct mdev_device *mdev, void *data)
-{
- mdev->driver_data = data;
-}
+extern void *mdev_get_drvdata(struct mdev_device *mdev);
+extern void mdev_set_drvdata(struct mdev_device *mdev, void *data);
+extern uuid_le mdev_uuid(struct mdev_device *mdev);

extern struct bus_type mdev_bus_type;

-#define dev_is_mdev(d) ((d)->bus == &mdev_bus_type)
-
extern int mdev_register_device(struct device *dev,
const struct mdev_parent_ops *ops);
extern void mdev_unregister_device(struct device *dev);
@@ -153,5 +132,7 @@ extern int mdev_register_device(struct device *dev,
extern void mdev_unregister_driver(struct mdev_driver *drv);

extern struct device *mdev_parent_dev(struct mdev_device *mdev);
+extern struct device *mdev_dev(struct mdev_device *mdev);
+extern struct mdev_device *mdev_from_dev(struct device *dev);

#endif /* MDEV_H */
diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
index 5e13efc..919c10d 100644
--- a/samples/vfio-mdev/mtty.c
+++ b/samples/vfio-mdev/mtty.c
@@ -164,7 +164,7 @@ static struct mdev_state *find_mdev_state_by_uuid(uuid_le uuid)
struct mdev_state *mds;

list_for_each_entry(mds, &mdev_devices_list, next) {
- if (uuid_le_cmp(mds->mdev->uuid, uuid) == 0)
+ if (uuid_le_cmp(mdev_uuid(mds->mdev), uuid) == 0)
return mds;
}

@@ -341,7 +341,8 @@ static void handle_bar_write(unsigned int index, struct mdev_state *mdev_state,
pr_err("Serial port %d: Fifo level trigger\n",
index);
#endif
- mtty_trigger_interrupt(mdev_state->mdev->uuid);
+ mtty_trigger_interrupt(
+ mdev_uuid(mdev_state->mdev));
}
} else {
#if defined(DEBUG_INTR)
@@ -355,7 +356,8 @@ static void handle_bar_write(unsigned int index, struct mdev_state *mdev_state,
*/
if (mdev_state->s[index].uart_reg[UART_IER] &
UART_IER_RLSI)
- mtty_trigger_interrupt(mdev_state->mdev->uuid);
+ mtty_trigger_interrupt(
+ mdev_uuid(mdev_state->mdev));
}
mutex_unlock(&mdev_state->rxtx_lock);
break;
@@ -374,7 +376,8 @@ static void handle_bar_write(unsigned int index, struct mdev_state *mdev_state,
pr_err("Serial port %d: IER_THRI write\n",
index);
#endif
- mtty_trigger_interrupt(mdev_state->mdev->uuid);
+ mtty_trigger_interrupt(
+ mdev_uuid(mdev_state->mdev));
}

mutex_unlock(&mdev_state->rxtx_lock);
@@ -445,7 +448,7 @@ static void handle_bar_write(unsigned int index, struct mdev_state *mdev_state,
#if defined(DEBUG_INTR)
pr_err("Serial port %d: MCR_OUT2 write\n", index);
#endif
- mtty_trigger_interrupt(mdev_state->mdev->uuid);
+ mtty_trigger_interrupt(mdev_uuid(mdev_state->mdev));
}

if ((mdev_state->s[index].uart_reg[UART_IER] & UART_IER_MSI) &&
@@ -453,7 +456,7 @@ static void handle_bar_write(unsigned int index, struct mdev_state *mdev_state,
#if defined(DEBUG_INTR)
pr_err("Serial port %d: MCR RTS/DTR write\n", index);
#endif
- mtty_trigger_interrupt(mdev_state->mdev->uuid);
+ mtty_trigger_interrupt(mdev_uuid(mdev_state->mdev));
}
break;

@@ -504,7 +507,8 @@ static void handle_bar_read(unsigned int index, struct mdev_state *mdev_state,
#endif
if (mdev_state->s[index].uart_reg[UART_IER] &
UART_IER_THRI)
- mtty_trigger_interrupt(mdev_state->mdev->uuid);
+ mtty_trigger_interrupt(
+ mdev_uuid(mdev_state->mdev));
}
mutex_unlock(&mdev_state->rxtx_lock);

@@ -1298,10 +1302,8 @@ void mtty_close(struct mdev_device *mdev)
sample_mdev_dev_show(struct device *dev, struct device_attribute *attr,
char *buf)
{
- struct mdev_device *mdev = to_mdev_device(dev);
-
- if (mdev)
- return sprintf(buf, "This is MDEV %s\n", dev_name(&mdev->dev));
+ if (mdev_from_dev(dev))
+ return sprintf(buf, "This is MDEV %s\n", dev_name(dev));

return sprintf(buf, "\n");
}

2016-12-25 17:10:08

by Kirti Wankhede

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] vfio-mdev: Fix remove race



On 12/23/2016 1:51 AM, Alex Williamson wrote:
> Using the mtty mdev sample driver we can generate a remove race by
> starting one shell that continuously creates mtty devices and several
> other shells all attempting to remove devices, in my case four remove
> shells. The fault occurs in mdev_remove_sysfs_files() where the
> passed type arg is NULL, which suggests we've received a struct device
> in mdev_device_remove() but it's in some sort of teardown state. The
> solution here is to make use of the accidentally unused list_head on
> the mdev_device such that the mdev core keeps a list of all the mdev
> devices. This allows us to validate that we have a valid mdev before
> we start removal, remove it from the list to prevent others from
> working on it, and if the vendor driver refuses to remove, we can
> re-add it to the list.
>

Alex,

Writing 1 on 'remove' first removes itself, i.e. calls
device_remove_file_self(dev, attr). So if the file is removed then
device_remove_file_self() should return false, isn't that returns false?
kernfs_remove_self() hold the mutex that should handle this condition.

Thanks,
Kirti.


> Cc: Kirti Wankhede <[email protected]>
> Signed-off-by: Alex Williamson <[email protected]>
> ---
> drivers/vfio/mdev/mdev_core.c | 36 ++++++++++++++++++++++++++++++++++--
> 1 file changed, 34 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
> index be1ee89..6bb4d4c 100644
> --- a/drivers/vfio/mdev/mdev_core.c
> +++ b/drivers/vfio/mdev/mdev_core.c
> @@ -27,6 +27,9 @@
> static DEFINE_MUTEX(parent_list_lock);
> static struct class_compat *mdev_bus_compat_class;
>
> +static LIST_HEAD(mdev_list);
> +static DEFINE_MUTEX(mdev_list_lock);
> +
> static int _find_mdev_device(struct device *dev, void *data)
> {
> struct mdev_device *mdev;
> @@ -316,6 +319,11 @@ int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le uuid)
> dev_dbg(&mdev->dev, "MDEV: created\n");
>
> mutex_unlock(&parent->lock);
> +
> + mutex_lock(&mdev_list_lock);
> + list_add(&mdev->next, &mdev_list);
> + mutex_unlock(&mdev_list_lock);
> +
> return ret;
>
> create_failed:
> @@ -329,12 +337,30 @@ int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le uuid)
>
> int mdev_device_remove(struct device *dev, bool force_remove)
> {
> - struct mdev_device *mdev;
> + struct mdev_device *mdev, *tmp;
> struct parent_device *parent;
> struct mdev_type *type;
> int ret;
> + bool found = false;
>
> mdev = to_mdev_device(dev);
> +
> + mutex_lock(&mdev_list_lock);
> + list_for_each_entry(tmp, &mdev_list, next) {
> + if (tmp == mdev) {
> + found = true;
> + break;
> + }
> + }
> +
> + if (found)
> + list_del(&mdev->next);
> +
> + mutex_unlock(&mdev_list_lock);
> +
> + if (!found)
> + return -ENODEV;
> +
> type = to_mdev_type(mdev->type_kobj);
> parent = mdev->parent;
> mutex_lock(&parent->lock);
> @@ -342,6 +368,11 @@ int mdev_device_remove(struct device *dev, bool force_remove)
> ret = mdev_device_remove_ops(mdev, force_remove);
> if (ret) {
> mutex_unlock(&parent->lock);
> +
> + mutex_lock(&mdev_list_lock);
> + list_add(&mdev->next, &mdev_list);
> + mutex_unlock(&mdev_list_lock);
> +
> return ret;
> }
>
> @@ -349,7 +380,8 @@ int mdev_device_remove(struct device *dev, bool force_remove)
> device_unregister(dev);
> mutex_unlock(&parent->lock);
> mdev_put_parent(parent);
> - return ret;
> +
> + return 0;
> }
>
> static int __init mdev_init(void)
>

2016-12-25 19:41:01

by Alex Williamson

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] vfio-mdev: Fix remove race

On Sun, 25 Dec 2016 22:39:47 +0530
Kirti Wankhede <[email protected]> wrote:

> On 12/23/2016 1:51 AM, Alex Williamson wrote:
> > Using the mtty mdev sample driver we can generate a remove race by
> > starting one shell that continuously creates mtty devices and several
> > other shells all attempting to remove devices, in my case four remove
> > shells. The fault occurs in mdev_remove_sysfs_files() where the
> > passed type arg is NULL, which suggests we've received a struct device
> > in mdev_device_remove() but it's in some sort of teardown state. The
> > solution here is to make use of the accidentally unused list_head on
> > the mdev_device such that the mdev core keeps a list of all the mdev
> > devices. This allows us to validate that we have a valid mdev before
> > we start removal, remove it from the list to prevent others from
> > working on it, and if the vendor driver refuses to remove, we can
> > re-add it to the list.
> >
>
> Alex,
>
> Writing 1 on 'remove' first removes itself, i.e. calls
> device_remove_file_self(dev, attr). So if the file is removed then
> device_remove_file_self() should return false, isn't that returns false?
> kernfs_remove_self() hold the mutex that should handle this condition.

In theory, I agree. In practice I was able to generate the race
described. We're getting through to call mdev_device_remove with
a struct device that resolves to an mdev where the type_kobj is
NULL, presumably it's been freed. Maybe there's a better fix
within kernfs, but this sanitizes the mdev on our end to resolve
it. To see the issue, simply run 'while true; do uuidgen >
create; done', then from a few other shells loop finding mdev
devices and remove any that are found. Set dmesg to only print
critical messages or else it'll slow create and delete to the
point where it'll be difficult to get the race. Thanks,

Alex

> > Cc: Kirti Wankhede <[email protected]>
> > Signed-off-by: Alex Williamson <[email protected]>
> > ---
> > drivers/vfio/mdev/mdev_core.c | 36 ++++++++++++++++++++++++++++++++++--
> > 1 file changed, 34 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
> > index be1ee89..6bb4d4c 100644
> > --- a/drivers/vfio/mdev/mdev_core.c
> > +++ b/drivers/vfio/mdev/mdev_core.c
> > @@ -27,6 +27,9 @@
> > static DEFINE_MUTEX(parent_list_lock);
> > static struct class_compat *mdev_bus_compat_class;
> >
> > +static LIST_HEAD(mdev_list);
> > +static DEFINE_MUTEX(mdev_list_lock);
> > +
> > static int _find_mdev_device(struct device *dev, void *data)
> > {
> > struct mdev_device *mdev;
> > @@ -316,6 +319,11 @@ int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le uuid)
> > dev_dbg(&mdev->dev, "MDEV: created\n");
> >
> > mutex_unlock(&parent->lock);
> > +
> > + mutex_lock(&mdev_list_lock);
> > + list_add(&mdev->next, &mdev_list);
> > + mutex_unlock(&mdev_list_lock);
> > +
> > return ret;
> >
> > create_failed:
> > @@ -329,12 +337,30 @@ int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le uuid)
> >
> > int mdev_device_remove(struct device *dev, bool force_remove)
> > {
> > - struct mdev_device *mdev;
> > + struct mdev_device *mdev, *tmp;
> > struct parent_device *parent;
> > struct mdev_type *type;
> > int ret;
> > + bool found = false;
> >
> > mdev = to_mdev_device(dev);
> > +
> > + mutex_lock(&mdev_list_lock);
> > + list_for_each_entry(tmp, &mdev_list, next) {
> > + if (tmp == mdev) {
> > + found = true;
> > + break;
> > + }
> > + }
> > +
> > + if (found)
> > + list_del(&mdev->next);
> > +
> > + mutex_unlock(&mdev_list_lock);
> > +
> > + if (!found)
> > + return -ENODEV;
> > +
> > type = to_mdev_type(mdev->type_kobj);
> > parent = mdev->parent;
> > mutex_lock(&parent->lock);
> > @@ -342,6 +368,11 @@ int mdev_device_remove(struct device *dev, bool force_remove)
> > ret = mdev_device_remove_ops(mdev, force_remove);
> > if (ret) {
> > mutex_unlock(&parent->lock);
> > +
> > + mutex_lock(&mdev_list_lock);
> > + list_add(&mdev->next, &mdev_list);
> > + mutex_unlock(&mdev_list_lock);
> > +
> > return ret;
> > }
> >
> > @@ -349,7 +380,8 @@ int mdev_device_remove(struct device *dev, bool force_remove)
> > device_unregister(dev);
> > mutex_unlock(&parent->lock);
> > mdev_put_parent(parent);
> > - return ret;
> > +
> > + return 0;
> > }
> >
> > static int __init mdev_init(void)
> >

2016-12-26 03:29:10

by Kirti Wankhede

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] vfio-mdev: Fix remove race



On 12/26/2016 1:10 AM, Alex Williamson wrote:
> On Sun, 25 Dec 2016 22:39:47 +0530
> Kirti Wankhede <[email protected]> wrote:
>
>> On 12/23/2016 1:51 AM, Alex Williamson wrote:
>>> Using the mtty mdev sample driver we can generate a remove race by
>>> starting one shell that continuously creates mtty devices and several
>>> other shells all attempting to remove devices, in my case four remove
>>> shells. The fault occurs in mdev_remove_sysfs_files() where the
>>> passed type arg is NULL, which suggests we've received a struct device
>>> in mdev_device_remove() but it's in some sort of teardown state. The
>>> solution here is to make use of the accidentally unused list_head on
>>> the mdev_device such that the mdev core keeps a list of all the mdev
>>> devices. This allows us to validate that we have a valid mdev before
>>> we start removal, remove it from the list to prevent others from
>>> working on it, and if the vendor driver refuses to remove, we can
>>> re-add it to the list.
>>>
>>
>> Alex,
>>
>> Writing 1 on 'remove' first removes itself, i.e. calls
>> device_remove_file_self(dev, attr). So if the file is removed then
>> device_remove_file_self() should return false, isn't that returns false?
>> kernfs_remove_self() hold the mutex that should handle this condition.
>
> In theory, I agree. In practice I was able to generate the race
> described. We're getting through to call mdev_device_remove with
> a struct device that resolves to an mdev where the type_kobj is
> NULL, presumably it's been freed. Maybe there's a better fix
> within kernfs, but this sanitizes the mdev on our end to resolve
> it. To see the issue, simply run 'while true; do uuidgen >
> create; done', then from a few other shells loop finding mdev
> devices and remove any that are found. Set dmesg to only print
> critical messages or else it'll slow create and delete to the
> point where it'll be difficult to get the race. Thanks,
>

I see. pci-sysfs too uses mutex around its remove function even after
device_remove_file_self() returned true. Yes, probably kernfs might have
better fix.
This change looks good to me.

Thanks,
Kirti

2016-12-26 03:31:47

by Kirti Wankhede

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] vfio-mdev: Fix remove race, clean namespace and better define ABI


On 12/23/2016 1:51 AM, Alex Williamson wrote:
> Cleanup the namespace a bit by prefixing structures with mdev_ and
> also more concretely define the mdev interface. Structs with comments
> defining which fields are private vs public tempts poor behavior,
> especially for an interface where we expect out of tree vendor drivers.
>

Patch 2/4-4/4 looks good to me.

Reviewed by: Kirti Wankhede <[email protected]>

Thanks,
Kirti

> Additionally in v2, the patch removing the next field from mdev_device
> is dropped, instead using it to fix a remove race, my From address is
> fixed, and Documentation is updated. Jike, I left your R-b on the
> patches that didn't change only. If I've missed any relevant doc
> updates, please let me know. Thanks,
>
> Alex
>
> ---
>
> Alex Williamson (4):
> vfio-mdev: Fix remove race
> vfio-mdev: de-polute the namespace, rename parent_device & parent_ops
> vfio-mdev: Make mdev_parent private
> vfio-mdev: Make mdev_device private and abstract interfaces
>
>
> Documentation/vfio-mediated-device.txt | 27 +++++----
> drivers/gpu/drm/i915/gvt/kvmgt.c | 22 ++++---
> drivers/vfio/mdev/mdev_core.c | 100 +++++++++++++++++++++++++++-----
> drivers/vfio/mdev/mdev_private.h | 29 ++++++++-
> drivers/vfio/mdev/mdev_sysfs.c | 8 +--
> drivers/vfio/mdev/vfio_mdev.c | 12 ++--
> include/linux/mdev.h | 54 ++++-------------
> samples/vfio-mdev/mtty.c | 28 +++++----
> 8 files changed, 173 insertions(+), 107 deletions(-)
>