v2:
* Added received acks and reviews, thanks!
* Rebased and resolved conflict in patch 2, dropped reviews due
to changes, and added Alexey to cc as spapr code is moved too
* Added stable tag for patches 1-3
* Resolved comment typo Eric noted in patch 1
* Split AMBA out to patches 8 & 9 as Eric noted amba_bustype is
not exported. These can be separate follow-up patches if delayed
Please re-ack/review patch 2. Eric, I'm happy to add your Tested-by
to the whole series if appropriate as well. Thanks,
Alex
v1:
VM hotplug testing reveals a number of races in the vfio device,
group, and container shutdown path. Some are attributed to libvirt's
ask/take unplug behavior and some are long standing, arising with
groups potentially composed of multiple devices, where each device can
be independently bound to drivers. Libvirt's ask/take behavior is a
result of the asynchronous nature of PCI hotplug: libvirt registers a
hot-unplug request (ask), which is acknowledged almost immediately, and
then proceeds to try to unbind the device from the vfio bus driver
(take). This sets us off on racing paths where we allow the device to
be released from the group, much like would happen in groups with
multiple devices, while the group and container are torn down
separately. These races are addressed in the first 3 patches of this
series.
The long-standing issue with removing devices from in-use groups is
that we feel that the system is compromised if we allow user and host
devices within the same non-isolated group. This triggers a BUG_ON
when we detect this condition after the rogue driver binding. Since
that code was put in place we've added driver_override support for
all of the physical buses supported by vfio, giving us a way to block
binding to such compromising drivers. We finally enable that in the
latter 4 patches of this series, minding that we need to allow
re-binding to non-compromising drivers, and also noting that a small
synchronization stall is effective in eliminating the need for this
blocking in the more common singleton device group case.
Reviews, comments, and acks appreciated. Thanks,
Alex
---
Alex Williamson (9):
vfio: Fix group release deadlock
kvm-vfio: Decouple only when we match a group
vfio: New external user group/file match
iommu: Add driver-not-bound notification
vfio: Create interface for vfio bus drivers to register
vfio: Register pci, platform, amba, and mdev bus drivers
vfio: Use driver_override to avert binding to compromising drivers
amba: Export amba_bustype
vfio: Add AMBA driver_override support
drivers/amba/bus.c | 1
drivers/iommu/iommu.c | 3
drivers/vfio/mdev/vfio_mdev.c | 13 ++
drivers/vfio/pci/vfio_pci.c | 7 +
drivers/vfio/platform/vfio_amba.c | 24 +++
drivers/vfio/platform/vfio_platform.c | 24 +++
drivers/vfio/vfio.c | 252 ++++++++++++++++++++++++++++++++-
include/linux/iommu.h | 1
include/linux/vfio.h | 5 +
virt/kvm/vfio.c | 39 +++--
10 files changed, 343 insertions(+), 26 deletions(-)
If vfio_iommu_group_notifier() acquires a group reference and that
reference becomes the last reference to the group, then vfio_group_put
introduces a deadlock code path where we're trying to unregister from
the iommu notifier chain from within a callout of that chain. Use a
work_struct to release this reference asynchronously.
Signed-off-by: Alex Williamson <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
Cc: [email protected]
---
drivers/vfio/vfio.c | 37 ++++++++++++++++++++++++++++++++++++-
1 file changed, 36 insertions(+), 1 deletion(-)
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 6a49485eb49d..54dd2fbf83d9 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -423,6 +423,34 @@ static void vfio_group_put(struct vfio_group *group)
kref_put_mutex(&group->kref, vfio_group_release, &vfio.group_lock);
}
+struct vfio_group_put_work {
+ struct work_struct work;
+ struct vfio_group *group;
+};
+
+static void vfio_group_put_bg(struct work_struct *work)
+{
+ struct vfio_group_put_work *do_work;
+
+ do_work = container_of(work, struct vfio_group_put_work, work);
+
+ vfio_group_put(do_work->group);
+ kfree(do_work);
+}
+
+static void vfio_group_schedule_put(struct vfio_group *group)
+{
+ struct vfio_group_put_work *do_work;
+
+ do_work = kmalloc(sizeof(*do_work), GFP_KERNEL);
+ if (WARN_ON(!do_work))
+ return;
+
+ INIT_WORK(&do_work->work, vfio_group_put_bg);
+ do_work->group = group;
+ schedule_work(&do_work->work);
+}
+
/* Assume group_lock or group reference is held */
static void vfio_group_get(struct vfio_group *group)
{
@@ -762,7 +790,14 @@ static int vfio_iommu_group_notifier(struct notifier_block *nb,
break;
}
- vfio_group_put(group);
+ /*
+ * If we're the last reference to the group, the group will be
+ * released, which includes unregistering the iommu group notifier.
+ * We hold a read-lock on that notifier list, unregistering needs
+ * a write-lock... deadlock. Release our reference asynchronously
+ * to avoid that situation.
+ */
+ vfio_group_schedule_put(group);
return NOTIFY_OK;
}
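As a rough user-space illustration of the deadlock this patch avoids (a single-threaded model, not kernel code), the sketch below tracks whether the notifier chain is "held" while the last reference is dropped. All names here (struct group, in_notifier, simulate_notifier, run_deferred_work) are hypothetical stand-ins; the real patch uses a work_struct and schedule_work() as shown in the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted vfio group. */
struct group {
	int refs;
	bool released;
};

/* True while a notifier callback runs, i.e. the chain's read lock is held. */
static bool in_notifier;
/* Set if a release (which needs the write lock) happens inside a callback. */
static bool deadlock_detected;
/* One-slot queue standing in for the kernel workqueue. */
static struct group *pending_put;

static void group_put(struct group *g)
{
	if (--g->refs == 0) {
		/* Release would unregister from the chain we may be inside. */
		if (in_notifier)
			deadlock_detected = true;
		g->released = true;
	}
}

/* Defer the put until after the callback has returned. */
static void group_schedule_put(struct group *g)
{
	pending_put = g;
}

static void run_deferred_work(void)
{
	if (pending_put) {
		struct group *g = pending_put;

		pending_put = NULL;
		group_put(g);
	}
}

/* Model of a notifier callback whose reference becomes the last one. */
static void simulate_notifier(struct group *g, bool defer)
{
	in_notifier = true;
	g->refs++;		/* reference taken by the callback */
	group_put(g);		/* the original owner drops out meanwhile */
	if (defer)
		group_schedule_put(g);
	else
		group_put(g);	/* last reference dropped inside the callout */
	in_notifier = false;
}
```

In this model the direct put trips the deadlock flag, while the deferred put releases the group only once the callback has returned.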
Unset-KVM and decrement-assignment only when we find the group in our
list. Otherwise we can get out of sync if the user triggers this for
groups that aren't currently on our list.
Signed-off-by: Alex Williamson <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Eric Auger <[email protected]>
Cc: Alexey Kardashevskiy <[email protected]>
Cc: [email protected]
---
virt/kvm/vfio.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
index 37d9118fd84b..f1b0b7bca9a9 100644
--- a/virt/kvm/vfio.c
+++ b/virt/kvm/vfio.c
@@ -246,21 +246,19 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr, u64 arg)
continue;
list_del(&kvg->node);
+ kvm_arch_end_assignment(dev->kvm);
+#ifdef CONFIG_SPAPR_TCE_IOMMU
+ kvm_spapr_tce_release_vfio_group(dev->kvm, vfio_group);
+#endif
+ kvm_vfio_group_set_kvm(kvg->vfio_group, NULL);
kvm_vfio_group_put_external_user(kvg->vfio_group);
kfree(kvg);
ret = 0;
break;
}
- kvm_arch_end_assignment(dev->kvm);
-
mutex_unlock(&kv->lock);
-#ifdef CONFIG_SPAPR_TCE_IOMMU
- kvm_spapr_tce_release_vfio_group(dev->kvm, vfio_group);
-#endif
- kvm_vfio_group_set_kvm(vfio_group, NULL);
-
kvm_vfio_group_put_external_user(vfio_group);
kvm_vfio_update_coherency(dev);
At the point where the kvm-vfio pseudo device wants to release its
vfio group reference, we can't always acquire a new reference to make
that happen. The group can be in a state where we wouldn't allow a
new reference to be added. This new helper function allows a caller
to match a file to a group to facilitate this. Given a file and
group, report if they match. Thus the caller needs to already have a
group reference to match to the file. This allows the deletion of a
group without acquiring a new reference.
Signed-off-by: Alex Williamson <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
Reviewed-by: Paolo Bonzini <[email protected]>
Cc: [email protected]
---
drivers/vfio/vfio.c | 9 +++++++++
include/linux/vfio.h | 2 ++
virt/kvm/vfio.c | 27 +++++++++++++++++++--------
3 files changed, 30 insertions(+), 8 deletions(-)
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 54dd2fbf83d9..7597a377eb4e 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1776,6 +1776,15 @@ void vfio_group_put_external_user(struct vfio_group *group)
}
EXPORT_SYMBOL_GPL(vfio_group_put_external_user);
+bool vfio_external_group_match_file(struct vfio_group *test_group,
+ struct file *filep)
+{
+ struct vfio_group *group = filep->private_data;
+
+ return (filep->f_op == &vfio_group_fops) && (group == test_group);
+}
+EXPORT_SYMBOL_GPL(vfio_external_group_match_file);
+
int vfio_external_user_iommu_id(struct vfio_group *group)
{
return iommu_group_id(group->iommu_group);
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index edf9b2cad277..9b34d0af5d27 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -97,6 +97,8 @@ extern void vfio_unregister_iommu_driver(
*/
extern struct vfio_group *vfio_group_get_external_user(struct file *filep);
extern void vfio_group_put_external_user(struct vfio_group *group);
+extern bool vfio_external_group_match_file(struct vfio_group *group,
+ struct file *filep);
extern int vfio_external_user_iommu_id(struct vfio_group *group);
extern long vfio_external_check_extension(struct vfio_group *group,
unsigned long arg);
diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
index f1b0b7bca9a9..9aba73127aac 100644
--- a/virt/kvm/vfio.c
+++ b/virt/kvm/vfio.c
@@ -51,6 +51,22 @@ static struct vfio_group *kvm_vfio_group_get_external_user(struct file *filep)
return vfio_group;
}
+static bool kvm_vfio_external_group_match_file(struct vfio_group *group,
+ struct file *filep)
+{
+ bool ret, (*fn)(struct vfio_group *, struct file *);
+
+ fn = symbol_get(vfio_external_group_match_file);
+ if (!fn)
+ return false;
+
+ ret = fn(group, filep);
+
+ symbol_put(vfio_external_group_match_file);
+
+ return ret;
+}
+
static void kvm_vfio_group_put_external_user(struct vfio_group *vfio_group)
{
void (*fn)(struct vfio_group *);
@@ -231,18 +247,13 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr, u64 arg)
if (!f.file)
return -EBADF;
- vfio_group = kvm_vfio_group_get_external_user(f.file);
- fdput(f);
-
- if (IS_ERR(vfio_group))
- return PTR_ERR(vfio_group);
-
ret = -ENOENT;
mutex_lock(&kv->lock);
list_for_each_entry(kvg, &kv->group_list, node) {
- if (kvg->vfio_group != vfio_group)
+ if (!kvm_vfio_external_group_match_file(kvg->vfio_group,
+ f.file))
continue;
list_del(&kvg->node);
@@ -259,7 +270,7 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr, u64 arg)
mutex_unlock(&kv->lock);
- kvm_vfio_group_put_external_user(vfio_group);
+ fdput(f);
kvm_vfio_update_coherency(dev);
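The matching idea in this patch can be sketched in user space as follows. This is only a model under stated assumptions: struct file, struct file_ops, struct group, group_fops, and group_match_file() are simplified, hypothetical stand-ins for the kernel types and for vfio_external_group_match_file(). The point is that the caller compares a file against a group it already holds, so no new reference is needed.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified, hypothetical stand-ins for the kernel structures. */
struct file_ops { int unused; };
struct file {
	const struct file_ops *f_op;
	void *private_data;
};
struct group { int id; };

/* Only files opened through group_fops carry a group in private_data. */
static const struct file_ops group_fops = { 0 };
static const struct file_ops other_fops = { 0 };

/*
 * Report whether filep is a group file that refers to test_group.
 * The caller must already hold a reference to test_group; none is taken here.
 */
static bool group_match_file(const struct group *test_group,
			     const struct file *filep)
{
	return filep->f_op == &group_fops &&
	       filep->private_data == test_group;
}
```

A file with a foreign f_op never matches, even if its private_data happens to hold the same pointer value.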
Generally we don't know about vfio bus drivers until a device is
added to the vfio-core with vfio_add_group_dev(). This optional
registration with vfio_register_bus_driver() allows vfio-core to
track known drivers. Our current use for this information is to
know whether a driver is vfio compatible during a bind operation.
For devices on buses with driver_override support, we can use this
linkage to block non-vfio drivers from binding to devices where
the iommu group state would trigger a BUG to avoid host/user
integrity issues.
Signed-off-by: Alex Williamson <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
---
drivers/vfio/vfio.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++++
include/linux/vfio.h | 3 +++
2 files changed, 58 insertions(+)
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 7597a377eb4e..f40d1508d368 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -43,6 +43,8 @@
struct class *class;
struct list_head iommu_drivers_list;
struct mutex iommu_drivers_lock;
+ struct list_head bus_drivers_list;
+ struct mutex bus_drivers_lock;
struct list_head group_list;
struct idr group_idr;
struct mutex group_lock;
@@ -51,6 +53,11 @@
wait_queue_head_t release_q;
} vfio;
+struct vfio_bus_driver {
+ struct device_driver *drv;
+ struct list_head vfio_next;
+};
+
struct vfio_iommu_driver {
const struct vfio_iommu_driver_ops *ops;
struct list_head vfio_next;
@@ -2243,6 +2250,52 @@ int vfio_unregister_notifier(struct device *dev, enum vfio_notify_type type,
}
EXPORT_SYMBOL(vfio_unregister_notifier);
+int vfio_register_bus_driver(struct device_driver *drv)
+{
+ struct vfio_bus_driver *driver, *tmp;
+
+ driver = kzalloc(sizeof(*driver), GFP_KERNEL);
+ if (!driver)
+ return -ENOMEM;
+
+ driver->drv = drv;
+
+ mutex_lock(&vfio.bus_drivers_lock);
+
+ /* Check for duplicates */
+ list_for_each_entry(tmp, &vfio.bus_drivers_list, vfio_next) {
+ if (tmp->drv == drv) {
+ mutex_unlock(&vfio.bus_drivers_lock);
+ kfree(driver);
+ return -EINVAL;
+ }
+ }
+
+ list_add(&driver->vfio_next, &vfio.bus_drivers_list);
+
+ mutex_unlock(&vfio.bus_drivers_lock);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(vfio_register_bus_driver);
+
+void vfio_unregister_bus_driver(struct device_driver *drv)
+{
+ struct vfio_bus_driver *driver;
+
+ mutex_lock(&vfio.bus_drivers_lock);
+ list_for_each_entry(driver, &vfio.bus_drivers_list, vfio_next) {
+ if (driver->drv == drv) {
+ list_del(&driver->vfio_next);
+ mutex_unlock(&vfio.bus_drivers_lock);
+ kfree(driver);
+ return;
+ }
+ }
+ mutex_unlock(&vfio.bus_drivers_lock);
+}
+EXPORT_SYMBOL_GPL(vfio_unregister_bus_driver);
+
/**
* Module/class support
*/
@@ -2266,8 +2319,10 @@ static int __init vfio_init(void)
idr_init(&vfio.group_idr);
mutex_init(&vfio.group_lock);
mutex_init(&vfio.iommu_drivers_lock);
+ mutex_init(&vfio.bus_drivers_lock);
INIT_LIST_HEAD(&vfio.group_list);
INIT_LIST_HEAD(&vfio.iommu_drivers_list);
+ INIT_LIST_HEAD(&vfio.bus_drivers_list);
init_waitqueue_head(&vfio.release_q);
ret = misc_register(&vfio_dev);
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 9b34d0af5d27..dab0f8105e4a 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -92,6 +92,9 @@ struct vfio_iommu_driver_ops {
extern void vfio_unregister_iommu_driver(
const struct vfio_iommu_driver_ops *ops);
+extern int vfio_register_bus_driver(struct device_driver *drv);
+extern void vfio_unregister_bus_driver(struct device_driver *drv);
+
/*
* External user API
*/
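The registry added by this patch amounts to a list of known drivers with a duplicate check on registration. A minimal user-space sketch, assuming a plain singly linked list and omitting the mutex that the kernel code takes around every list access (struct driver, bus_driver_known(), register_bus_driver(), and unregister_bus_driver() are hypothetical names, not the kernel API):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct device_driver. */
struct driver { const char *name; };

struct bus_driver_entry {
	const struct driver *drv;
	struct bus_driver_entry *next;
};

/* Head of the list of registered drivers (locking omitted in this sketch). */
static struct bus_driver_entry *bus_drivers;

static bool bus_driver_known(const struct driver *drv)
{
	const struct bus_driver_entry *e;

	for (e = bus_drivers; e; e = e->next)
		if (e->drv == drv)
			return true;
	return false;
}

static int register_bus_driver(const struct driver *drv)
{
	struct bus_driver_entry *e;

	if (bus_driver_known(drv))
		return -EINVAL;		/* duplicate registration */

	e = malloc(sizeof(*e));
	if (!e)
		return -ENOMEM;

	e->drv = drv;
	e->next = bus_drivers;
	bus_drivers = e;
	return 0;
}

static void unregister_bus_driver(const struct driver *drv)
{
	struct bus_driver_entry **pp;

	for (pp = &bus_drivers; *pp; pp = &(*pp)->next) {
		if ((*pp)->drv == drv) {
			struct bus_driver_entry *e = *pp;

			*pp = e->next;
			free(e);
			return;
		}
	}
}
```

The lookup helper corresponds to the check a later patch in the series performs at bind time to decide whether the driver being bound is a known vfio bus driver.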
Hook into vfio bus driver register/unregister support.
Signed-off-by: Alex Williamson <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
Cc: Baptiste Reynal <[email protected]>
Cc: Kirti Wankhede <[email protected]>
---
drivers/vfio/mdev/vfio_mdev.c | 13 ++++++++++++-
drivers/vfio/pci/vfio_pci.c | 7 +++++++
drivers/vfio/platform/vfio_amba.c | 24 +++++++++++++++++++++++-
drivers/vfio/platform/vfio_platform.c | 24 +++++++++++++++++++++++-
4 files changed, 65 insertions(+), 3 deletions(-)
diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
index fa848a701b8b..b73d6f3e8ad5 100644
--- a/drivers/vfio/mdev/vfio_mdev.c
+++ b/drivers/vfio/mdev/vfio_mdev.c
@@ -131,11 +131,22 @@ struct mdev_driver vfio_mdev_driver = {
static int __init vfio_mdev_init(void)
{
- return mdev_register_driver(&vfio_mdev_driver, THIS_MODULE);
+ int ret;
+
+ ret = mdev_register_driver(&vfio_mdev_driver, THIS_MODULE);
+ if (ret)
+ return ret;
+
+ ret = vfio_register_bus_driver(&vfio_mdev_driver.driver);
+ if (ret)
+ mdev_unregister_driver(&vfio_mdev_driver);
+
+ return ret;
}
static void __exit vfio_mdev_exit(void)
{
+ vfio_unregister_bus_driver(&vfio_mdev_driver.driver);
mdev_unregister_driver(&vfio_mdev_driver);
}
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 063c1ce6fa42..b9ac5b8c53e6 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -1397,6 +1397,7 @@ static void vfio_pci_try_bus_reset(struct vfio_pci_device *vdev)
static void __exit vfio_pci_cleanup(void)
{
+ vfio_unregister_bus_driver(&vfio_pci_driver.driver);
pci_unregister_driver(&vfio_pci_driver);
vfio_pci_uninit_perm_bits();
}
@@ -1456,6 +1457,12 @@ static int __init vfio_pci_init(void)
if (ret)
goto out_driver;
+ ret = vfio_register_bus_driver(&vfio_pci_driver.driver);
+ if (ret) {
+ pci_unregister_driver(&vfio_pci_driver);
+ goto out_driver;
+ }
+
vfio_pci_fill_ids();
return 0;
diff --git a/drivers/vfio/platform/vfio_amba.c b/drivers/vfio/platform/vfio_amba.c
index 31372fbf6c5b..7fd9cb4a6756 100644
--- a/drivers/vfio/platform/vfio_amba.c
+++ b/drivers/vfio/platform/vfio_amba.c
@@ -109,7 +109,29 @@ static int vfio_amba_remove(struct amba_device *adev)
},
};
-module_amba_driver(vfio_amba_driver);
+static void __exit vfio_amba_exit(void)
+{
+ vfio_unregister_bus_driver(&vfio_amba_driver.drv);
+ amba_driver_unregister(&vfio_amba_driver);
+}
+
+static int __init vfio_amba_init(void)
+{
+ int ret;
+
+ ret = amba_driver_register(&vfio_amba_driver);
+ if (ret)
+ return ret;
+
+ ret = vfio_register_bus_driver(&vfio_amba_driver.drv);
+ if (ret)
+ amba_driver_unregister(&vfio_amba_driver);
+
+ return ret;
+}
+
+module_init(vfio_amba_init);
+module_exit(vfio_amba_exit);
MODULE_VERSION(DRIVER_VERSION);
MODULE_LICENSE("GPL v2");
diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
index 6561751a1063..3974dc65e6dc 100644
--- a/drivers/vfio/platform/vfio_platform.c
+++ b/drivers/vfio/platform/vfio_platform.c
@@ -100,7 +100,29 @@ static int vfio_platform_remove(struct platform_device *pdev)
},
};
-module_platform_driver(vfio_platform_driver);
+static void __exit vfio_platform_exit(void)
+{
+ vfio_unregister_bus_driver(&vfio_platform_driver.driver);
+ platform_driver_unregister(&vfio_platform_driver);
+}
+
+static int __init vfio_platform_init(void)
+{
+ int ret;
+
+ ret = platform_driver_register(&vfio_platform_driver);
+ if (ret)
+ return ret;
+
+ ret = vfio_register_bus_driver(&vfio_platform_driver.driver);
+ if (ret)
+ platform_driver_unregister(&vfio_platform_driver);
+
+ return ret;
+}
+
+module_init(vfio_platform_init);
+module_exit(vfio_platform_exit);
MODULE_VERSION(DRIVER_VERSION);
MODULE_LICENSE("GPL v2");
The driver core supports a BUS_NOTIFY_DRIVER_NOT_BOUND notification
sent if a driver fails to bind to a device. Extend IOMMU group
notifications to include a version of this.
Signed-off-by: Alex Williamson <[email protected]>
Acked-by: Joerg Roedel <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
---
drivers/iommu/iommu.c | 3 +++
include/linux/iommu.h | 1 +
2 files changed, 4 insertions(+)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index cf7ca7e70777..1a59b3626ab2 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1139,6 +1139,9 @@ static int iommu_bus_notifier(struct notifier_block *nb,
case BUS_NOTIFY_UNBOUND_DRIVER:
group_action = IOMMU_GROUP_NOTIFY_UNBOUND_DRIVER;
break;
+ case BUS_NOTIFY_DRIVER_NOT_BOUND:
+ group_action = IOMMU_GROUP_NOTIFY_DRIVER_NOT_BOUND;
+ break;
}
if (group_action)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 2cb54adc4a33..5f6b01398bc1 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -271,6 +271,7 @@ static inline void iommu_device_set_fwnode(struct iommu_device *iommu,
#define IOMMU_GROUP_NOTIFY_BOUND_DRIVER 4 /* Post Driver bind */
#define IOMMU_GROUP_NOTIFY_UNBIND_DRIVER 5 /* Pre Driver unbind */
#define IOMMU_GROUP_NOTIFY_UNBOUND_DRIVER 6 /* Post Driver unbind */
+#define IOMMU_GROUP_NOTIFY_DRIVER_NOT_BOUND 7 /* Driver bind failed */
extern int bus_set_iommu(struct bus_type *bus, const struct iommu_ops *ops);
extern bool iommu_present(struct bus_type *bus);
If a device is bound to a non-vfio, non-whitelisted driver while a
group is in use, then the integrity of the group is compromised and
will result in hitting a BUG_ON. This code tries to avoid this case
by mangling driver_override to force a no-match for the driver. The
driver-core will either follow up with a DRIVER_NOT_BOUND (preferred)
or BOUND_DRIVER, at which point we can remove the driver_override
mangling.
A complication here is that even though the window between these
notifications is expected to be extremely small, the vfio group could
be removed, which would prevent us from finding the group again to
remove the driver_override. We therefore take a group reference when
adding to driver_override and release it when removed. A second
complication is that driver_override can be modified by the system
admin through sysfs. To avoid trivial interference, we add a non-
user-visible UUID to the group and use this as part of the mangle
string.
The above blocks binding to a driver that would compromise the host,
but we'd also like to avoid reaching that step if possible. For this
we add a wait_event_timeout() with a short, 1-second timeout, which is
highly effective in allowing empty groups to finish cleanup.
Signed-off-by: Alex Williamson <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
Tested-by: Eric Auger <[email protected]>
---
drivers/vfio/vfio.c | 145 +++++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 138 insertions(+), 7 deletions(-)
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index f40d1508d368..20e57fecf652 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -25,6 +25,7 @@
#include <linux/miscdevice.h>
#include <linux/module.h>
#include <linux/mutex.h>
+#include <linux/platform_device.h>
#include <linux/pci.h>
#include <linux/rwsem.h>
#include <linux/sched.h>
@@ -32,6 +33,7 @@
#include <linux/stat.h>
#include <linux/string.h>
#include <linux/uaccess.h>
+#include <linux/uuid.h>
#include <linux/vfio.h>
#include <linux/wait.h>
@@ -95,6 +97,7 @@ struct vfio_group {
bool noiommu;
struct kvm *kvm;
struct blocking_notifier_head notifier;
+ unsigned char uuid[16];
};
struct vfio_device {
@@ -352,6 +355,8 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group)
group->nb.notifier_call = vfio_iommu_group_notifier;
+ generate_random_uuid(group->uuid);
+
/*
* blocking notifiers acquire a rwsem around registering and hold
* it around callback. Therefore, need to register outside of
@@ -728,6 +733,111 @@ static int vfio_group_nb_verify(struct vfio_group *group, struct device *dev)
return vfio_dev_viable(dev, group);
}
+#define VFIO_TAG_PREFIX "#vfio_group:"
+
+static char **vfio_find_driver_override(struct device *dev)
+{
+ if (dev_is_pci(dev)) {
+ struct pci_dev *pdev = to_pci_dev(dev);
+ return &pdev->driver_override;
+ } else if (dev->bus == &platform_bus_type) {
+ struct platform_device *pdev = to_platform_device(dev);
+ return &pdev->driver_override;
+ }
+
+ return NULL;
+}
+
+/*
+ * If we're about to bind to something other than a known whitelisted driver
+ * or known vfio bus driver, try to avert it with driver_override.
+ */
+static void vfio_group_nb_pre_bind(struct vfio_group *group, struct device *dev)
+{
+ struct vfio_bus_driver *driver;
+ struct device_driver *drv = ACCESS_ONCE(dev->driver);
+ char **driver_override;
+
+ if (vfio_dev_whitelisted(dev, drv))
+ return; /* Binding to known "innocuous" device/driver */
+
+ mutex_lock(&vfio.bus_drivers_lock);
+ list_for_each_entry(driver, &vfio.bus_drivers_list, vfio_next) {
+ if (driver->drv == drv) {
+ mutex_unlock(&vfio.bus_drivers_lock);
+ return; /* Binding to known vfio bus driver, ok */
+ }
+ }
+ mutex_unlock(&vfio.bus_drivers_lock);
+
+ /* Can we stall slightly to let users fall off? */
+ if (list_empty(&group->device_list)) {
+ if (wait_event_timeout(vfio.release_q,
+ !atomic_read(&group->container_users), HZ))
+ return;
+ }
+
+ driver_override = vfio_find_driver_override(dev);
+ if (driver_override) {
+ char tag[50], *new = NULL, *old = *driver_override;
+
+ snprintf(tag, sizeof(tag), "%s%pU",
+ VFIO_TAG_PREFIX, group->uuid);
+
+ if (old && strstr(old, tag))
+ return; /* block already in place */
+
+ new = kasprintf(GFP_KERNEL, "%s%s", old ? old : "", tag);
+ if (new) {
+ *driver_override = new;
+ kfree(old);
+ vfio_group_get(group);
+ dev_warn(dev, "vfio: Blocking unsafe driver bind\n");
+ return;
+ }
+ }
+
+ dev_warn(dev, "vfio: Unsafe driver binding to in-use group!\n");
+}
+
+/* If we've mangled driver_override, remove it */
+static void vfio_group_nb_post_bind(struct vfio_group *group,
+ struct device *dev)
+{
+ char **driver_override = vfio_find_driver_override(dev);
+
+ if (driver_override && *driver_override) {
+ char tag[50], *new, *start, *end, *old = *driver_override;
+
+ snprintf(tag, sizeof(tag), "%s%pU",
+ VFIO_TAG_PREFIX, group->uuid);
+
+ start = strstr(old, tag);
+ if (start) {
+ end = start + strlen(tag);
+
+ if (old + strlen(old) > end)
+ memmove(start, end,
+ strlen(old) - (end - old) + 1);
+ else
+ *start = 0;
+
+ if (strlen(old)) {
+ new = kasprintf(GFP_KERNEL, "%s", old);
+ if (new) {
+ *driver_override = new;
+ kfree(old);
+ } /* else, in-place terminated, ok */
+ } else {
+ *driver_override = NULL;
+ kfree(old);
+ }
+
+ vfio_group_put(group);
+ }
+ }
+}
+
static int vfio_iommu_group_notifier(struct notifier_block *nb,
unsigned long action, void *data)
{
@@ -757,14 +867,23 @@ static int vfio_iommu_group_notifier(struct notifier_block *nb,
*/
break;
case IOMMU_GROUP_NOTIFY_BIND_DRIVER:
- pr_debug("%s: Device %s, group %d binding to driver\n",
+ pr_debug("%s: Device %s, group %d binding to driver %s\n",
__func__, dev_name(dev),
- iommu_group_id(group->iommu_group));
+ iommu_group_id(group->iommu_group), dev->driver->name);
+ if (vfio_group_nb_verify(group, dev))
+ vfio_group_nb_pre_bind(group, dev);
+ break;
+ case IOMMU_GROUP_NOTIFY_DRIVER_NOT_BOUND:
+ pr_debug("%s: Device %s, group %d binding fail to driver %s\n",
+ __func__, dev_name(dev),
+ iommu_group_id(group->iommu_group), dev->driver->name);
+ vfio_group_nb_post_bind(group, dev);
break;
case IOMMU_GROUP_NOTIFY_BOUND_DRIVER:
pr_debug("%s: Device %s, group %d bound to driver %s\n",
__func__, dev_name(dev),
iommu_group_id(group->iommu_group), dev->driver->name);
+ vfio_group_nb_post_bind(group, dev);
BUG_ON(vfio_group_nb_verify(group, dev));
break;
case IOMMU_GROUP_NOTIFY_UNBIND_DRIVER:
@@ -1351,6 +1470,7 @@ static int vfio_group_unset_container(struct vfio_group *group)
if (users != 1)
return -EBUSY;
+ wake_up(&vfio.release_q);
__vfio_group_unset_container(group);
return 0;
@@ -1364,7 +1484,11 @@ static int vfio_group_unset_container(struct vfio_group *group)
*/
static void vfio_group_try_dissolve_container(struct vfio_group *group)
{
- if (0 == atomic_dec_if_positive(&group->container_users))
+ int users = atomic_dec_if_positive(&group->container_users);
+
+ wake_up(&vfio.release_q);
+
+ if (!users)
__vfio_group_unset_container(group);
}
@@ -1433,19 +1557,26 @@ static bool vfio_group_viable(struct vfio_group *group)
static int vfio_group_add_container_user(struct vfio_group *group)
{
+ int ret;
+
if (!atomic_inc_not_zero(&group->container_users))
return -EINVAL;
if (group->noiommu) {
- atomic_dec(&group->container_users);
- return -EPERM;
+ ret = -EPERM;
+ goto out;
}
if (!group->container->iommu_driver || !vfio_group_viable(group)) {
- atomic_dec(&group->container_users);
- return -EINVAL;
+ ret = -EINVAL;
+ goto out;
}
return 0;
+
+out:
+ atomic_dec(&group->container_users);
+ wake_up(&vfio.release_q);
+ return ret;
}
static const struct file_operations vfio_device_fops;
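The driver_override mangling described in this patch can be modelled in user space as plain string manipulation. The sketch below is illustrative only: override_block(), override_unblock(), and driver_matches() are hypothetical helpers, the tag is passed in rather than built from a group UUID, and driver_matches() is a simplified model that assumes the bus requires the driver name to equal a non-empty override string exactly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified model: a set override must equal the driver name exactly. */
static bool driver_matches(const char *override, const char *drv_name)
{
	return !override || strcmp(override, drv_name) == 0;
}

/* Append tag to *override. Returns 1 if added, 0 if present, -1 on failure. */
static int override_block(char **override, const char *tag)
{
	const char *old = *override ? *override : "";
	size_t len;
	char *new;

	if (strstr(old, tag))
		return 0;	/* block already in place */

	len = strlen(old) + strlen(tag) + 1;
	new = malloc(len);
	if (!new)
		return -1;

	snprintf(new, len, "%s%s", old, tag);
	free(*override);
	*override = new;
	return 1;
}

/* Cut tag back out of *override. Returns 1 if removed, 0 if not found. */
static int override_unblock(char **override, const char *tag)
{
	char *start, *end;

	if (!*override)
		return 0;

	start = strstr(*override, tag);
	if (!start)
		return 0;

	end = start + strlen(tag);
	memmove(start, end, strlen(end) + 1);	/* include the terminator */

	if (**override == '\0') {		/* nothing left: clear override */
		free(*override);
		*override = NULL;
	}
	return 1;
}
```

Because the tag is appended to whatever the admin had set, removing it restores the previous override unchanged, and while it is present no driver name can match the combined string.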
This allows modules to match struct device.bus to amba_bustype for the
purpose of casting the device to an amba_device with to_amba_device().
Signed-off-by: Alex Williamson <[email protected]>
Reported-by: Eric Auger <[email protected]>
Cc: Russell King <[email protected]>
---
drivers/amba/bus.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/amba/bus.c b/drivers/amba/bus.c
index a56fa2a1e9aa..4e118cd3ddf3 100644
--- a/drivers/amba/bus.c
+++ b/drivers/amba/bus.c
@@ -197,6 +197,7 @@ struct bus_type amba_bustype = {
.uevent = amba_uevent,
.pm = &amba_pm,
};
+EXPORT_SYMBOL_GPL(amba_bustype);
static int __init amba_init(void)
{
AMBA also supports driver_override, but until now amba_bustype was not
exported, so we had no way to identify an amba device.
Signed-off-by: Alex Williamson <[email protected]>
---
drivers/vfio/vfio.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 20e57fecf652..36f0fcfded0b 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -36,6 +36,7 @@
#include <linux/uuid.h>
#include <linux/vfio.h>
#include <linux/wait.h>
+#include <linux/amba/bus.h>
#define DRIVER_VERSION "0.3"
#define DRIVER_AUTHOR "Alex Williamson <[email protected]>"
@@ -743,6 +744,11 @@ static char **vfio_find_driver_override(struct device *dev)
} else if (dev->bus == &platform_bus_type) {
struct platform_device *pdev = to_platform_device(dev);
return &pdev->driver_override;
+#ifdef CONFIG_ARM_AMBA
+ } else if (dev->bus == &amba_bustype) {
+ struct amba_device *adev = to_amba_device(dev);
+ return &adev->driver_override;
+#endif
}
return NULL;
This patch on its own doesn't make much sense to me... any chance of
seeing the full series please?
Thanks.
On Mon, Jun 19, 2017 at 11:15:29AM -0600, Alex Williamson wrote:
> This allows modules to match struct device.bus to amba_bustype for the
> purpose of casting the device to an amba_device with to_amba_device().
>
> Signed-off-by: Alex Williamson <[email protected]>
> Reported-by: Eric Auger <[email protected]>
> Cc: Russell King <[email protected]>
> ---
> drivers/amba/bus.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/amba/bus.c b/drivers/amba/bus.c
> index a56fa2a1e9aa..4e118cd3ddf3 100644
> --- a/drivers/amba/bus.c
> +++ b/drivers/amba/bus.c
> @@ -197,6 +197,7 @@ struct bus_type amba_bustype = {
> .uevent = amba_uevent,
> .pm = &amba_pm,
> };
> +EXPORT_SYMBOL_GPL(amba_bustype);
>
> static int __init amba_init(void)
> {
>
On Mon, 19 Jun 2017 18:31:10 +0100
Russell King - ARM Linux <[email protected]> wrote:
> This patch on its own doesn't make much sense to me... any chance of
> seeing the full series please?
Hi Russell,
Please find it here:
https://lkml.org/lkml/2017/6/19/808
Patches 7 and 9:
https://lkml.org/lkml/2017/6/19/813
https://lkml.org/lkml/2017/6/19/812
Are particularly relevant. The gist of it is that we want to get to
the driver_override field of the device in order to force a no-match
for a driver bind when we're in a situation where binding to a host
driver could compromise the system integrity (user owned devices and
host owned devices in the same iommu group). driver_override is handled
in a common way, but is not part of struct device, it's part of the
containing structure, so we identify the bustype and therefore device
container to get to the override. The bustype is typically exported
and some bus drivers like PCI even offer convenient helpers like
dev_is_pci() to facilitate this sort of matching. Thanks,
Alex
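To illustrate the point about driver_override living in the containing structure rather than in struct device, here is a small user-space sketch. The structures are simplified, hypothetical stand-ins (the real pci_dev and platform_device are far larger), and find_driver_override() only mirrors the shape of vfio_find_driver_override() from patch 7.

```c
#include <assert.h>
#include <stddef.h>

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified, hypothetical stand-ins for the kernel structures. */
struct bus_type { const char *name; };
struct device { const struct bus_type *bus; };

static const struct bus_type pci_bus = { "pci" };
static const struct bus_type platform_bus = { "platform" };
static const struct bus_type other_bus = { "other" };

/* Each bus embeds struct device at a different offset in its own type. */
struct pci_dev {
	int devfn;
	struct device dev;
	char *driver_override;
};

struct platform_dev {
	struct device dev;
	char *driver_override;
};

/* Identify the bus, then cast to the container to reach driver_override. */
static char **find_driver_override(struct device *dev)
{
	if (dev->bus == &pci_bus)
		return &container_of(dev, struct pci_dev, dev)->driver_override;
	if (dev->bus == &platform_bus)
		return &container_of(dev, struct platform_dev,
				     dev)->driver_override;
	return NULL;	/* unknown bus: no override to mangle */
}
```

The bus comparison is what requires the bus type symbol to be visible to the module doing the lookup, which is why an unexported bus type cannot be handled this way.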
> On Mon, Jun 19, 2017 at 11:15:29AM -0600, Alex Williamson wrote:
> > This allows modules to match struct device.bus to amba_bustype for the
> > purpose of casting the device to an amba_device with to_amba_device().
> >
> > Signed-off-by: Alex Williamson <[email protected]>
> > Reported-by: Eric Auger <[email protected]>
> > Cc: Russell King <[email protected]>
> > ---
> > drivers/amba/bus.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/amba/bus.c b/drivers/amba/bus.c
> > index a56fa2a1e9aa..4e118cd3ddf3 100644
> > --- a/drivers/amba/bus.c
> > +++ b/drivers/amba/bus.c
> > @@ -197,6 +197,7 @@ struct bus_type amba_bustype = {
> > .uevent = amba_uevent,
> > .pm = &amba_pm,
> > };
> > +EXPORT_SYMBOL_GPL(amba_bustype);
> >
> > static int __init amba_init(void)
> > {
> >
>
On 20/06/17 03:14, Alex Williamson wrote:
> Unset-KVM and decrement-assignment only when we find the group in our
> list. Otherwise we can get out of sync if the user triggers this for
> groups that aren't currently on our list.
>
> Signed-off-by: Alex Williamson <[email protected]>
> Cc: Paolo Bonzini <[email protected]>
> Cc: Eric Auger <[email protected]>
> Cc: Alexey Kardashevskiy <[email protected]>
> Cc: [email protected]
> ---
> virt/kvm/vfio.c | 12 +++++-------
> 1 file changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
> index 37d9118fd84b..f1b0b7bca9a9 100644
> --- a/virt/kvm/vfio.c
> +++ b/virt/kvm/vfio.c
> @@ -246,21 +246,19 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr, u64 arg)
> continue;
>
> list_del(&kvg->node);
> + kvm_arch_end_assignment(dev->kvm);
> +#ifdef CONFIG_SPAPR_TCE_IOMMU
> + kvm_spapr_tce_release_vfio_group(dev->kvm, vfio_group);
> +#endif
> + kvm_vfio_group_set_kvm(kvg->vfio_group, NULL);
> kvm_vfio_group_put_external_user(kvg->vfio_group);
> kfree(kvg);
> ret = 0;
> break;
> }
>
> - kvm_arch_end_assignment(dev->kvm);
> -
> mutex_unlock(&kv->lock);
>
> -#ifdef CONFIG_SPAPR_TCE_IOMMU
> - kvm_spapr_tce_release_vfio_group(dev->kvm, vfio_group);
> -#endif
> - kvm_vfio_group_set_kvm(vfio_group, NULL);
Tiny nit: vfio_group becomes kvg->vfio_group in kvm_vfio_group_set_kvm()
and does not in kvm_spapr_tce_release_vfio_group().
Anyway,
Reviewed-by: Alexey Kardashevskiy <[email protected]>
> -
> kvm_vfio_group_put_external_user(vfio_group);
>
> kvm_vfio_update_coherency(dev);
>
--
Alexey
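
The commit message's rule (undo the accounting only when the group is actually found on the list) can be sketched in plain user-space C. This is only an illustration under assumed names: `struct tracker`, `tracker_add()` and `tracker_del()` are hypothetical stand-ins, not the kvm-vfio code, and the `assignments` field merely models the counter that kvm_arch_end_assignment() decrements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kvm-vfio bookkeeping; not the kernel API. */
#define MAX_GROUPS 8

struct tracker {
	int groups[MAX_GROUPS];	/* ids of the groups currently on our list */
	size_t count;		/* number of valid entries in groups[] */
	int assignments;	/* models the per-VM assignment counter */
};

/* Add a group and account for it; -1 if the list is full or id is a duplicate. */
static int tracker_add(struct tracker *t, int id)
{
	assert(t != NULL);
	for (size_t i = 0; i < t->count; i++)
		if (t->groups[i] == id)
			return -1;
	if (t->count == MAX_GROUPS)
		return -1;
	t->groups[t->count++] = id;
	t->assignments++;
	return 0;
}

/* Remove a group; all teardown lives inside the match branch. */
static int tracker_del(struct tracker *t, int id)
{
	assert(t != NULL);
	for (size_t i = 0; i < t->count; i++) {
		if (t->groups[i] != id)
			continue;
		/* Found: drop the entry and undo its accounting together. */
		t->groups[i] = t->groups[--t->count];
		t->assignments--;
		return 0;
	}
	/* Not on our list: the counter must stay untouched. */
	return -1;
}
```

With the decrement placed after the loop instead, a delete request for an unknown group would still lower `assignments`, which is the out-of-sync case the commit message describes.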
On Tue, 20 Jun 2017 12:34:57 +1000
Alexey Kardashevskiy <[email protected]> wrote:
> On 20/06/17 03:14, Alex Williamson wrote:
> > Unset-KVM and decrement-assignment only when we find the group in our
> > list. Otherwise we can get out of sync if the user triggers this for
> > groups that aren't currently on our list.
> >
> > Signed-off-by: Alex Williamson <[email protected]>
> > Cc: Paolo Bonzini <[email protected]>
> > Cc: Eric Auger <[email protected]>
> > Cc: Alexey Kardashevskiy <[email protected]>
> > Cc: [email protected]
> > ---
> > virt/kvm/vfio.c | 12 +++++-------
> > 1 file changed, 5 insertions(+), 7 deletions(-)
> >
> > diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
> > index 37d9118fd84b..f1b0b7bca9a9 100644
> > --- a/virt/kvm/vfio.c
> > +++ b/virt/kvm/vfio.c
> > @@ -246,21 +246,19 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr, u64 arg)
> > continue;
> >
> > list_del(&kvg->node);
> > + kvm_arch_end_assignment(dev->kvm);
> > +#ifdef CONFIG_SPAPR_TCE_IOMMU
> > + kvm_spapr_tce_release_vfio_group(dev->kvm, vfio_group);
> > +#endif
> > + kvm_vfio_group_set_kvm(kvg->vfio_group, NULL);
> > kvm_vfio_group_put_external_user(kvg->vfio_group);
> > kfree(kvg);
> > ret = 0;
> > break;
> > }
> >
> > - kvm_arch_end_assignment(dev->kvm);
> > -
> > mutex_unlock(&kv->lock);
> >
> > -#ifdef CONFIG_SPAPR_TCE_IOMMU
> > - kvm_spapr_tce_release_vfio_group(dev->kvm, vfio_group);
> > -#endif
> > - kvm_vfio_group_set_kvm(vfio_group, NULL);
>
>
> Tiny nit: vfio_group becomes kvg->vfio_group in kvm_vfio_group_set_kvm()
> and does not in kvm_spapr_tce_release_vfio_group().
>
>
> Anyway,
>
> Reviewed-by: Alexey Kardashevskiy <[email protected]>

Thanks, I made the following change for consistency:

diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
index f1b0b7bca9a9..6e002d0f3191 100644
--- a/virt/kvm/vfio.c
+++ b/virt/kvm/vfio.c
@@ -248,7 +248,8 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr, u64 arg)
list_del(&kvg->node);
kvm_arch_end_assignment(dev->kvm);
#ifdef CONFIG_SPAPR_TCE_IOMMU
- kvm_spapr_tce_release_vfio_group(dev->kvm, vfio_group);
+ kvm_spapr_tce_release_vfio_group(dev->kvm,
+ kvg->vfio_group);
#endif
kvm_vfio_group_set_kvm(kvg->vfio_group, NULL);
kvm_vfio_group_put_external_user(kvg->vfio_group);
Hi Alex,
[auto build test ERROR on vfio/next]
[also build test ERROR on v4.12-rc6 next-20170619]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Alex-Williamson/vfio-Fix-release-ordering-races-and-use-driver_override/20170620-095741
base: https://github.com/awilliam/linux-vfio.git next
config: powerpc-defconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=powerpc
Note: it may well be a FALSE warning. FWIW you are at least aware of it now.
http://gcc.gnu.org/wiki/Better_Uninitialized_Warnings
All errors (new ones prefixed by >>):
arch/powerpc/kvm/../../../virt/kvm/vfio.c: In function 'kvm_vfio_set_attr':
>> arch/powerpc/kvm/../../../virt/kvm/vfio.c:262:4: error: 'vfio_group' may be used uninitialized in this function [-Werror=maybe-uninitialized]
kvm_spapr_tce_release_vfio_group(dev->kvm, vfio_group);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/powerpc/kvm/../../../virt/kvm/vfio.c:190:21: note: 'vfio_group' was declared here
struct vfio_group *vfio_group;
^~~~~~~~~~
cc1: all warnings being treated as errors
vim +/vfio_group +262 arch/powerpc/kvm/../../../virt/kvm/vfio.c
600c6bde Alex Williamson 2017-06-19 256 f.file))
ec53500f Alex Williamson 2013-10-30 257 continue;
ec53500f Alex Williamson 2013-10-30 258
ec53500f Alex Williamson 2013-10-30 259 list_del(&kvg->node);
14979b3f Alex Williamson 2017-06-19 260 kvm_arch_end_assignment(dev->kvm);
14979b3f Alex Williamson 2017-06-19 261 #ifdef CONFIG_SPAPR_TCE_IOMMU
14979b3f Alex Williamson 2017-06-19 @262 kvm_spapr_tce_release_vfio_group(dev->kvm, vfio_group);
14979b3f Alex Williamson 2017-06-19 263 #endif
14979b3f Alex Williamson 2017-06-19 264 kvm_vfio_group_set_kvm(kvg->vfio_group, NULL);
ec53500f Alex Williamson 2013-10-30 265 kvm_vfio_group_put_external_user(kvg->vfio_group);
:::::: The code at line 262 was first introduced by commit
:::::: 14979b3f26fbbe87d4240e463db53e64dd127184 kvm-vfio: Decouple only when we match a group
:::::: TO: Alex Williamson <[email protected]>
:::::: CC: 0day robot <[email protected]>
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
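
The call flagged at virt/kvm/vfio.c:262 is the same one that the consistency follow-up earlier in this thread switches to kvg->vfio_group. A minimal user-space sketch of that shape follows, assuming (the thread does not spell this out) that the function-scope local is assigned only on the add path. All names here (`struct group`, `struct entry`, `set_group()`, `release_group()`) are hypothetical and are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only; not the kernel's definitions. */
struct group { int id; };
struct entry { struct group *grp; int in_use; };

enum op { OP_ADD, OP_DEL };

/* Stand-in for a per-group release hook; records which group it was given. */
static struct group *last_released;

static void release_group(struct group *g)
{
	last_released = g;
}

static int set_group(struct entry *tbl, size_t n, enum op op, struct group *arg)
{
	struct group *grp;	/* assigned on the OP_ADD path only */
	size_t i;

	assert(tbl != NULL && arg != NULL);

	switch (op) {
	case OP_ADD:
		grp = arg;
		for (i = 0; i < n; i++) {
			if (tbl[i].in_use)
				continue;
			tbl[i].grp = grp;
			tbl[i].in_use = 1;
			return 0;
		}
		return -1;
	case OP_DEL:
		for (i = 0; i < n; i++) {
			if (!tbl[i].in_use || tbl[i].grp->id != arg->id)
				continue;
			/* Use the matched entry's pointer; 'grp' is never set on this path. */
			release_group(tbl[i].grp);
			tbl[i].in_use = 0;
			return 0;
		}
		return -1;
	}
	return -1;
}
```

Passing `grp` to release_group() in the OP_DEL branch would read a variable that no statement on that path has written, which is the kind of use -Wmaybe-uninitialized reports.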