Hi folks,
The iommu group is the minimal isolation boundary for DMA. Devices in
a group can access each other's MMIO registers via peer to peer DMA
and also need share the same I/O address space.
Once the I/O address space is assigned to user control it is no longer
available to the dma_map* API, which effectively makes the DMA API
non-working.
Second, userspace can use DMA initiated by a device that it controls
to access the MMIO spaces of other devices in the group. This allows
userspace to indirectly attack any kernel owned device and it's driver.
Therefore groups must either be entirely under kernel control or
userspace control, never a mixture. Unfortunately some systems have
problems with the granularity of groups and there are a couple of
important exceptions:
- pci_stub allows the admin to block driver binding on a device and
make it permanently shared with userspace. Since PCI stub does not
do DMA it is safe, however the admin must understand that using
pci_stub allows userspace to attack whatever device it was bound
it.
- PCI bridges are sometimes included in groups. Typically PCI bridges
do not use DMA, and generally do not have MMIO regions.
Generally any device that does not have any MMIO registers is a
possible candidate for an exception.
Currently vfio adopts a workaround to detect violations of the above
restrictions by monitoring the driver core BOUND event, and hardwiring
the above exceptions. Since there is no way for vfio to reject driver
binding at this point, BUG_ON() is triggered if a violation is
captured (kernel driver BOUND event on a group which already has some
devices assigned to userspace). Aside from the bad user experience
this opens a way for root userspace to crash the kernel, even in high
integrity configurations, by manipulating the module binding and
triggering the BUG_ON.
This series solves this problem by making the user/kernel ownership a
core concept at the IOMMU layer. The driver core enforces kernel
ownership while drivers are bound and violations now result in a error
codes during probe, not BUG_ON failures.
Patch partitions:
[PATCH 1-9]: Detect DMA ownership conflicts during driver binding;
[PATCH 10-13]: Add security context management for assigned devices;
[PATCH 14-18]: Various cleanups.
This is part one of three initial series for IOMMUFD:
* Move IOMMU Group security into the iommu layer
- Generic IOMMUFD implementation
- VFIO ability to consume IOMMUFD
Change log:
v1: initial post
- https://lore.kernel.org/linux-iommu/[email protected]/
v2:
- https://lore.kernel.org/linux-iommu/[email protected]/
- Move kernel dma ownership auto-claiming from driver core to bus
callback. [Greg/Christoph/Robin/Jason]
https://lore.kernel.org/linux-iommu/[email protected]/T/#m153706912b770682cb12e3c28f57e171aa1f9d0c
- Code and interface refactoring for iommu_set/release_dma_owner()
interfaces. [Jason]
https://lore.kernel.org/linux-iommu/[email protected]/T/#mea70ed8e4e3665aedf32a5a0a7db095bf680325e
- [NEW]Add new iommu_attach/detach_device_shared() interfaces for
multiple devices group. [Robin/Jason]
https://lore.kernel.org/linux-iommu/[email protected]/T/#mea70ed8e4e3665aedf32a5a0a7db095bf680325e
- [NEW]Use iommu_attach/detach_device_shared() in drm/tegra drivers.
- Refactoring and description refinement.
v3:
- Rename bus_type::dma_unconfigure to bus_type::dma_cleanup. [Greg]
https://lore.kernel.org/linux-iommu/[email protected]/T/#m6711e041e47cb0cbe3964fad0a3466f5ae4b3b9b
- Avoid _platform_dma_configure for platform_bus_type::dma_configure.
[Greg]
https://lore.kernel.org/linux-iommu/[email protected]/T/#m43fc46286611aa56a5c0eeaad99d539e5519f3f6
- Patch "0012-iommu-Add-iommu_at-de-tach_device_shared-for-mult.patch"
and "0018-drm-tegra-Use-the-iommu-dma_owner-mechanism.patch" have
been tested by Dmitry Osipenko <[email protected]>.
This is based on v5.16-rc3 and available on github:
https://github.com/LuBaolu/intel-iommu/commits/iommu-dma-ownership-v3
Best regards,
baolu
Jason Gunthorpe (2):
vfio: Delete the unbound_list
drm/tegra: Use the iommu dma_owner mechanism
Lu Baolu (16):
iommu: Add device dma ownership set/release interfaces
driver core: Add dma_cleanup callback in bus_type
driver core: platform: Rename platform_dma_configure()
driver core: platform: Add driver dma ownership management
amba: Add driver dma ownership management
bus: fsl-mc: Add driver dma ownership management
PCI: Add driver dma ownership management
PCI: pci_stub: Suppress kernel DMA ownership auto-claiming
PCI: portdrv: Suppress kernel DMA ownership auto-claiming
iommu: Add security context management for assigned devices
iommu: Expose group variants of dma ownership interfaces
iommu: Add iommu_at[de]tach_device_shared() for multi-device groups
vfio: Set DMA USER ownership for VFIO devices
vfio: Remove use of vfio_group_viable()
vfio: Remove iommu group notifier
iommu: Remove iommu group changes notifier
include/linux/amba/bus.h | 1 +
include/linux/device/bus.h | 3 +
include/linux/fsl/mc.h | 5 +
include/linux/iommu.h | 93 ++++++--
include/linux/pci.h | 5 +
include/linux/platform_device.h | 3 +-
drivers/amba/bus.c | 30 ++-
drivers/base/dd.c | 7 +-
drivers/base/platform.c | 31 ++-
drivers/bus/fsl-mc/fsl-mc-bus.c | 26 +-
drivers/gpu/drm/tegra/dc.c | 1 +
drivers/gpu/drm/tegra/drm.c | 55 +++--
drivers/gpu/drm/tegra/gr2d.c | 1 +
drivers/gpu/drm/tegra/gr3d.c | 1 +
drivers/gpu/drm/tegra/vic.c | 1 +
drivers/iommu/iommu.c | 329 ++++++++++++++++++++------
drivers/pci/pci-driver.c | 21 ++
drivers/pci/pci-stub.c | 1 +
drivers/pci/pcie/portdrv_pci.c | 2 +
drivers/vfio/fsl-mc/vfio_fsl_mc.c | 1 +
drivers/vfio/pci/vfio_pci.c | 1 +
drivers/vfio/platform/vfio_amba.c | 1 +
drivers/vfio/platform/vfio_platform.c | 1 +
drivers/vfio/vfio.c | 248 ++-----------------
24 files changed, 502 insertions(+), 366 deletions(-)
--
2.25.1
From the perspective of who is initiating the device to do DMA, device
DMA could be divided into the following types:
DMA_OWNER_DMA_API: Device DMAs are initiated by a kernel driver
through the kernel DMA API.
DMA_OWNER_PRIVATE_DOMAIN: Device DMAs are initiated by a kernel
driver with its own PRIVATE domain.
DMA_OWNER_PRIVATE_DOMAIN_USER: Device DMAs are initiated by
userspace.
Different DMA ownerships are exclusive for all devices in the same iommu
group as an iommu group is the smallest granularity of device isolation
and protection that the IOMMU subsystem can guarantee. This extends the
iommu core to enforce this exclusion.
Basically two new interfaces are provided:
int iommu_device_set_dma_owner(struct device *dev,
enum iommu_dma_owner type, void *owner_cookie);
void iommu_device_release_dma_owner(struct device *dev,
enum iommu_dma_owner type);
Although above interfaces are per-device, DMA owner is tracked per group
under the hood. An iommu group cannot have different dma ownership set
at the same time. Violation of this assumption fails
iommu_device_set_dma_owner().
Kernel driver which does DMA have DMA_OWNER_DMA_API automatically set/
released in the driver binding/unbinding process (see next patch).
Kernel driver which doesn't do DMA could avoid setting the owner type.
Device bound to such driver is considered same as a driver-less device
which is compatible to all owner types.
Userspace driver framework (e.g. vfio) should set
DMA_OWNER_PRIVATE_DOMAIN_USER for a device before the userspace is allowed
to access it, plus a owner cookie pointer to mark the user identity so a
single group cannot be operated by multiple users simultaneously. Vice
versa, the owner type should be released after the user access permission
is withdrawn.
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Kevin Tian <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
---
include/linux/iommu.h | 36 +++++++++++++++++
drivers/iommu/iommu.c | 93 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 129 insertions(+)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index d2f3435e7d17..24676b498f38 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -162,6 +162,23 @@ enum iommu_dev_features {
IOMMU_DEV_FEAT_IOPF,
};
+/**
+ * enum iommu_dma_owner - IOMMU DMA ownership
+ * @DMA_OWNER_NONE: No DMA ownership.
+ * @DMA_OWNER_DMA_API: Device DMAs are initiated by a kernel driver through
+ * the kernel DMA API.
+ * @DMA_OWNER_PRIVATE_DOMAIN: Device DMAs are initiated by a kernel driver
+ * which provides an UNMANAGED domain.
+ * @DMA_OWNER_PRIVATE_DOMAIN_USER: Device DMAs are initiated by userspace,
+ * kernel ensures that DMAs never go to kernel memory.
+ */
+enum iommu_dma_owner {
+ DMA_OWNER_NONE,
+ DMA_OWNER_DMA_API,
+ DMA_OWNER_PRIVATE_DOMAIN,
+ DMA_OWNER_PRIVATE_DOMAIN_USER,
+};
+
#define IOMMU_PASID_INVALID (-1U)
#ifdef CONFIG_IOMMU_API
@@ -681,6 +698,10 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev,
void iommu_sva_unbind_device(struct iommu_sva *handle);
u32 iommu_sva_get_pasid(struct iommu_sva *handle);
+int iommu_device_set_dma_owner(struct device *dev, enum iommu_dma_owner owner,
+ void *owner_cookie);
+void iommu_device_release_dma_owner(struct device *dev, enum iommu_dma_owner owner);
+
#else /* CONFIG_IOMMU_API */
struct iommu_ops {};
@@ -1081,6 +1102,21 @@ static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev)
{
return NULL;
}
+
+static inline int iommu_device_set_dma_owner(struct device *dev,
+ enum iommu_dma_owner owner,
+ void *owner_cookie)
+{
+ if (owner != DMA_OWNER_DMA_API)
+ return -EINVAL;
+
+ return 0;
+}
+
+static inline void iommu_device_release_dma_owner(struct device *dev,
+ enum iommu_dma_owner owner)
+{
+}
#endif /* CONFIG_IOMMU_API */
/**
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 8b86406b7162..1de520a07518 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -48,6 +48,9 @@ struct iommu_group {
struct iommu_domain *default_domain;
struct iommu_domain *domain;
struct list_head entry;
+ enum iommu_dma_owner dma_owner;
+ refcount_t owner_cnt;
+ void *owner_cookie;
};
struct group_device {
@@ -621,6 +624,7 @@ struct iommu_group *iommu_group_alloc(void)
INIT_LIST_HEAD(&group->devices);
INIT_LIST_HEAD(&group->entry);
BLOCKING_INIT_NOTIFIER_HEAD(&group->notifier);
+ group->dma_owner = DMA_OWNER_NONE;
ret = ida_simple_get(&iommu_group_ida, 0, 0, GFP_KERNEL);
if (ret < 0) {
@@ -3351,3 +3355,92 @@ static ssize_t iommu_group_store_type(struct iommu_group *group,
return ret;
}
+
+static int __iommu_group_set_dma_owner(struct iommu_group *group,
+ enum iommu_dma_owner owner,
+ void *owner_cookie)
+{
+ if (refcount_inc_not_zero(&group->owner_cnt)) {
+ if (group->dma_owner != owner ||
+ group->owner_cookie != owner_cookie) {
+ refcount_dec(&group->owner_cnt);
+ return -EBUSY;
+ }
+
+ return 0;
+ }
+
+ group->dma_owner = owner;
+ group->owner_cookie = owner_cookie;
+ refcount_set(&group->owner_cnt, 1);
+
+ return 0;
+}
+
+static void __iommu_group_release_dma_owner(struct iommu_group *group,
+ enum iommu_dma_owner owner)
+{
+ if (WARN_ON(group->dma_owner != owner))
+ return;
+
+ if (!refcount_dec_and_test(&group->owner_cnt))
+ return;
+
+ group->dma_owner = DMA_OWNER_NONE;
+}
+
+/**
+ * iommu_device_set_dma_owner() - Set DMA ownership of a device
+ * @dev: The device.
+ * @owner: DMA ownership type.
+ * @owner_cookie: Caller specified pointer. Could be used for exclusive
+ * declaration. Could be NULL.
+ *
+ * Set the DMA ownership of a device. The different ownerships are
+ * exclusive. The caller could specify a owner_cookie pointer so that
+ * the same DMA ownership could be exclusive among different owners.
+ */
+int iommu_device_set_dma_owner(struct device *dev, enum iommu_dma_owner owner,
+ void *owner_cookie)
+{
+ struct iommu_group *group = iommu_group_get(dev);
+ int ret;
+
+ if (!group) {
+ if (owner == DMA_OWNER_DMA_API)
+ return 0;
+ else
+ return -ENODEV;
+ }
+
+ mutex_lock(&group->mutex);
+ ret = __iommu_group_set_dma_owner(group, owner, owner_cookie);
+ mutex_unlock(&group->mutex);
+ iommu_group_put(group);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_device_set_dma_owner);
+
+/**
+ * iommu_device_release_dma_owner() - Release DMA ownership of a device
+ * @dev: The device.
+ * @owner: The DMA ownership type.
+ *
+ * Release the DMA ownership claimed by iommu_device_set_dma_owner().
+ */
+void iommu_device_release_dma_owner(struct device *dev, enum iommu_dma_owner owner)
+{
+ struct iommu_group *group = iommu_group_get(dev);
+
+ if (!group) {
+ WARN_ON(owner != DMA_OWNER_DMA_API);
+ return;
+ }
+
+ mutex_lock(&group->mutex);
+ __iommu_group_release_dma_owner(group, owner);
+ mutex_unlock(&group->mutex);
+ iommu_group_put(group);
+}
+EXPORT_SYMBOL_GPL(iommu_device_release_dma_owner);
--
2.25.1
The bus_type structure defines dma_configure() callback for bus drivers
to configure DMA on the devices. This adds the paired dma_cleanup()
callback and calls it during driver unbinding so that bus drivers can do
some cleanup work.
One use case for this paired DMA callbacks is for the bus driver to check
for DMA ownership conflicts during driver binding, where multiple devices
belonging to a same IOMMU group (the minimum granularity of isolation and
protection) may be assigned to kernel drivers or user space respectively.
Without this change, for example, the vfio driver has to listen to a bus
BOUND_DRIVER event and then BUG_ON() in case of dma ownership conflict.
This leads to bad user experience since careless driver binding operation
may crash the system if the admin overlooks the group restriction. Aside
from bad design, this leads to a security problem as a root user, even with
lockdown=integrity, can force the kernel to BUG.
With this change, the bus driver could check and set the DMA ownership in
driver binding process and fail on ownership conflicts. The DMA ownership
should be released during driver unbinding.
Suggested-by: Jason Gunthorpe <[email protected]>
Link: https://lore.kernel.org/linux-iommu/[email protected]/
Link: https://lore.kernel.org/linux-iommu/[email protected]/
Signed-off-by: Lu Baolu <[email protected]>
---
include/linux/device/bus.h | 3 +++
drivers/base/dd.c | 7 ++++++-
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/include/linux/device/bus.h b/include/linux/device/bus.h
index a039ab809753..d8b29ccd07e5 100644
--- a/include/linux/device/bus.h
+++ b/include/linux/device/bus.h
@@ -59,6 +59,8 @@ struct fwnode_handle;
* bus supports.
* @dma_configure: Called to setup DMA configuration on a device on
* this bus.
+ * @dma_cleanup: Called to cleanup DMA configuration on a device on
+ * this bus.
* @pm: Power management operations of this bus, callback the specific
* device driver's pm-ops.
* @iommu_ops: IOMMU specific operations for this bus, used to attach IOMMU
@@ -103,6 +105,7 @@ struct bus_type {
int (*num_vf)(struct device *dev);
int (*dma_configure)(struct device *dev);
+ void (*dma_cleanup)(struct device *dev);
const struct dev_pm_ops *pm;
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 68ea1f949daa..ae457fa2bca6 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -577,7 +577,7 @@ static int really_probe(struct device *dev, struct device_driver *drv)
if (dev->bus->dma_configure) {
ret = dev->bus->dma_configure(dev);
if (ret)
- goto probe_failed;
+ goto pinctrl_bind_failed;
}
ret = driver_sysfs_add(dev);
@@ -660,6 +660,8 @@ static int really_probe(struct device *dev, struct device_driver *drv)
if (dev->bus)
blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
BUS_NOTIFY_DRIVER_NOT_BOUND, dev);
+ if (dev->bus->dma_cleanup)
+ dev->bus->dma_cleanup(dev);
pinctrl_bind_failed:
device_links_no_driver(dev);
devres_release_all(dev);
@@ -1204,6 +1206,9 @@ static void __device_release_driver(struct device *dev, struct device *parent)
else if (drv->remove)
drv->remove(dev);
+ if (dev->bus->dma_cleanup)
+ dev->bus->dma_cleanup(dev);
+
device_links_driver_cleanup(dev);
devres_release_all(dev);
--
2.25.1
Multiple platform devices may be placed in the same IOMMU group because
they cannot be isolated from each other. These devices must either be
entirely under kernel control or userspace control, never a mixture. This
checks and sets DMA ownership during driver binding, and release the
ownership during driver unbinding.
Driver may set a new flag (suppress_auto_claim_dma_owner) to disable auto
claiming DMA_OWNER_DMA_API ownership in the binding process. For instance,
the userspace framework drivers (vfio etc.) which need to manually claim
DMA_OWNER_PRIVATE_DOMAIN_USER when assigning a device to userspace.
Signed-off-by: Lu Baolu <[email protected]>
---
include/linux/platform_device.h | 1 +
drivers/base/platform.c | 30 +++++++++++++++++++++++++++++-
2 files changed, 30 insertions(+), 1 deletion(-)
diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
index 4381c34af7e0..f3926be7582f 100644
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -210,6 +210,7 @@ struct platform_driver {
struct device_driver driver;
const struct platform_device_id *id_table;
bool prevent_deferred_probe;
+ bool suppress_auto_claim_dma_owner;
};
#define to_platform_driver(drv) (container_of((drv), struct platform_driver, \
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 125d708c0eb3..032b9f20b468 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -30,6 +30,7 @@
#include <linux/property.h>
#include <linux/kmemleak.h>
#include <linux/types.h>
+#include <linux/iommu.h>
#include "base.h"
#include "power/power.h"
@@ -1464,6 +1465,32 @@ int firmware_dma_configure(struct device *dev)
return ret;
}
+static int platform_dma_configure(struct device *dev)
+{
+ struct platform_driver *drv = to_platform_driver(dev->driver);
+ int ret;
+
+ if (!drv->suppress_auto_claim_dma_owner) {
+ ret = iommu_device_set_dma_owner(dev, DMA_OWNER_DMA_API, NULL);
+ if (ret)
+ return ret;
+ }
+
+ ret = firmware_dma_configure(dev);
+ if (ret && !drv->suppress_auto_claim_dma_owner)
+ iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API);
+
+ return ret;
+}
+
+static void platform_dma_cleanup(struct device *dev)
+{
+ struct platform_driver *drv = to_platform_driver(dev->driver);
+
+ if (!drv->suppress_auto_claim_dma_owner)
+ iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API);
+}
+
static const struct dev_pm_ops platform_dev_pm_ops = {
SET_RUNTIME_PM_OPS(pm_generic_runtime_suspend, pm_generic_runtime_resume, NULL)
USE_PLATFORM_PM_SLEEP_OPS
@@ -1477,7 +1504,8 @@ struct bus_type platform_bus_type = {
.probe = platform_probe,
.remove = platform_remove,
.shutdown = platform_shutdown,
- .dma_configure = firmware_dma_configure,
+ .dma_configure = platform_dma_configure,
+ .dma_cleanup = platform_dma_cleanup,
.pm = &platform_dev_pm_ops,
};
EXPORT_SYMBOL_GPL(platform_bus_type);
--
2.25.1
The platform_dma_configure() is shared between platform and amba bus
drivers. Rename the common helper to firmware_dma_configure() so that
both platform and amba bus drivers could customize their dma_configure
callbacks.
Signed-off-by: Lu Baolu <[email protected]>
---
include/linux/platform_device.h | 2 +-
drivers/amba/bus.c | 2 +-
drivers/base/platform.c | 5 ++---
3 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
index 7c96f169d274..4381c34af7e0 100644
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -328,7 +328,7 @@ extern int platform_pm_restore(struct device *dev);
#define platform_pm_restore NULL
#endif
-extern int platform_dma_configure(struct device *dev);
+extern int firmware_dma_configure(struct device *dev);
#ifdef CONFIG_PM_SLEEP
#define USE_PLATFORM_PM_SLEEP_OPS \
diff --git a/drivers/amba/bus.c b/drivers/amba/bus.c
index 720aa6cdd402..08c094124c0e 100644
--- a/drivers/amba/bus.c
+++ b/drivers/amba/bus.c
@@ -319,7 +319,7 @@ struct bus_type amba_bustype = {
.probe = amba_probe,
.remove = amba_remove,
.shutdown = amba_shutdown,
- .dma_configure = platform_dma_configure,
+ .dma_configure = firmware_dma_configure,
.pm = &amba_pm,
};
EXPORT_SYMBOL_GPL(amba_bustype);
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 598acf93a360..125d708c0eb3 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -1449,8 +1449,7 @@ static void platform_shutdown(struct device *_dev)
drv->shutdown(dev);
}
-
-int platform_dma_configure(struct device *dev)
+int firmware_dma_configure(struct device *dev)
{
enum dev_dma_attr attr;
int ret = 0;
@@ -1478,7 +1477,7 @@ struct bus_type platform_bus_type = {
.probe = platform_probe,
.remove = platform_remove,
.shutdown = platform_shutdown,
- .dma_configure = platform_dma_configure,
+ .dma_configure = firmware_dma_configure,
.pm = &platform_dev_pm_ops,
};
EXPORT_SYMBOL_GPL(platform_bus_type);
--
2.25.1
Multiple amba devices may be placed in the same IOMMU group because
they cannot be isolated from each other. These devices must either be
entirely under kernel control or userspace control, never a mixture. This
checks and sets DMA ownership during driver binding, and release the
ownership during driver unbinding.
Driver may set a new flag (suppress_auto_claim_dma_owner) to disable auto
claiming DMA_OWNER_DMA_API ownership in the binding process. For instance,
the userspace framework drivers (vfio etc.) which need to manually claim
DMA_OWNER_PRIVATE_DOMAIN_USER when assigning a device to userspace.
Signed-off-by: Lu Baolu <[email protected]>
---
include/linux/amba/bus.h | 1 +
drivers/amba/bus.c | 30 +++++++++++++++++++++++++++++-
2 files changed, 30 insertions(+), 1 deletion(-)
diff --git a/include/linux/amba/bus.h b/include/linux/amba/bus.h
index edfcf7a14dcd..745c5a60ddd8 100644
--- a/include/linux/amba/bus.h
+++ b/include/linux/amba/bus.h
@@ -79,6 +79,7 @@ struct amba_driver {
void (*remove)(struct amba_device *);
void (*shutdown)(struct amba_device *);
const struct amba_id *id_table;
+ bool suppress_auto_claim_dma_owner;
};
/*
diff --git a/drivers/amba/bus.c b/drivers/amba/bus.c
index 08c094124c0e..d220584c900b 100644
--- a/drivers/amba/bus.c
+++ b/drivers/amba/bus.c
@@ -20,6 +20,7 @@
#include <linux/platform_device.h>
#include <linux/reset.h>
#include <linux/of_irq.h>
+#include <linux/iommu.h>
#include <asm/irq.h>
@@ -305,6 +306,32 @@ static const struct dev_pm_ops amba_pm = {
)
};
+static int amba_dma_configure(struct device *dev)
+{
+ struct amba_driver *drv = to_amba_driver(dev->driver);
+ int ret;
+
+ if (!drv->suppress_auto_claim_dma_owner) {
+ ret = iommu_device_set_dma_owner(dev, DMA_OWNER_DMA_API, NULL);
+ if (ret)
+ return ret;
+ }
+
+ ret = firmware_dma_configure(dev);
+ if (ret && !drv->suppress_auto_claim_dma_owner)
+ iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API);
+
+ return ret;
+}
+
+static void amba_dma_cleanup(struct device *dev)
+{
+ struct amba_driver *drv = to_amba_driver(dev->driver);
+
+ if (!drv->suppress_auto_claim_dma_owner)
+ iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API);
+}
+
/*
* Primecells are part of the Advanced Microcontroller Bus Architecture,
* so we call the bus "amba".
@@ -319,7 +346,8 @@ struct bus_type amba_bustype = {
.probe = amba_probe,
.remove = amba_remove,
.shutdown = amba_shutdown,
- .dma_configure = firmware_dma_configure,
+ .dma_configure = amba_dma_configure,
+ .dma_cleanup = amba_dma_cleanup,
.pm = &amba_pm,
};
EXPORT_SYMBOL_GPL(amba_bustype);
--
2.25.1
Multiple fsl-mc devices may be placed in the same IOMMU group because
they cannot be isolated from each other. These devices must either be
entirely under kernel control or userspace control, never a mixture. This
checks and sets DMA ownership during driver binding, and release the
ownership during driver unbinding.
Driver may set a new flag (suppress_auto_claim_dma_owner) to disable auto
claiming DMA_OWNER_DMA_API ownership in the binding process. For instance,
the userspace framework drivers (vfio etc.) which need to manually claim
DMA_OWNER_PRIVATE_DOMAIN_USER when assigning a device to userspace.
Signed-off-by: Lu Baolu <[email protected]>
---
include/linux/fsl/mc.h | 5 +++++
drivers/bus/fsl-mc/fsl-mc-bus.c | 26 ++++++++++++++++++++++++--
2 files changed, 29 insertions(+), 2 deletions(-)
diff --git a/include/linux/fsl/mc.h b/include/linux/fsl/mc.h
index e026f6c48b49..3a26421ee3dc 100644
--- a/include/linux/fsl/mc.h
+++ b/include/linux/fsl/mc.h
@@ -32,6 +32,10 @@ struct fsl_mc_io;
* @shutdown: Function called at shutdown time to quiesce the device
* @suspend: Function called when a device is stopped
* @resume: Function called when a device is resumed
+ * @suppress_auto_claim_dma_owner: Disable auto claiming of kernel DMA owner.
+ * Drivers which don't require DMA or want to manually claim the
+ * owner type (e.g. userspace driver frameworks) could set this
+ * flag.
*
* Generic DPAA device driver object for device drivers that are registered
* with a DPRC bus. This structure is to be embedded in each device-specific
@@ -45,6 +49,7 @@ struct fsl_mc_driver {
void (*shutdown)(struct fsl_mc_device *dev);
int (*suspend)(struct fsl_mc_device *dev, pm_message_t state);
int (*resume)(struct fsl_mc_device *dev);
+ bool suppress_auto_claim_dma_owner;
};
#define to_fsl_mc_driver(_drv) \
diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index 8fd4a356a86e..048a837a7116 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -140,15 +140,36 @@ static int fsl_mc_dma_configure(struct device *dev)
{
struct device *dma_dev = dev;
struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev);
+ struct fsl_mc_driver *mc_drv = to_fsl_mc_driver(dev->driver);
u32 input_id = mc_dev->icid;
+ int ret;
+
+ if (!mc_drv->suppress_auto_claim_dma_owner) {
+ ret = iommu_device_set_dma_owner(dev, DMA_OWNER_DMA_API, NULL);
+ if (ret)
+ return ret;
+ }
while (dev_is_fsl_mc(dma_dev))
dma_dev = dma_dev->parent;
if (dev_of_node(dma_dev))
- return of_dma_configure_id(dev, dma_dev->of_node, 0, &input_id);
+ ret = of_dma_configure_id(dev, dma_dev->of_node, 0, &input_id);
+ else
+ ret = acpi_dma_configure_id(dev, DEV_DMA_COHERENT, &input_id);
+
+ if (ret && !mc_drv->suppress_auto_claim_dma_owner)
+ iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API);
+
+ return ret;
+}
+
+static void fsl_mc_dma_cleanup(struct device *dev)
+{
+ struct fsl_mc_driver *mc_drv = to_fsl_mc_driver(dev->driver);
- return acpi_dma_configure_id(dev, DEV_DMA_COHERENT, &input_id);
+ if (!mc_drv->suppress_auto_claim_dma_owner)
+ iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API);
}
static ssize_t modalias_show(struct device *dev, struct device_attribute *attr,
@@ -312,6 +333,7 @@ struct bus_type fsl_mc_bus_type = {
.match = fsl_mc_bus_match,
.uevent = fsl_mc_bus_uevent,
.dma_configure = fsl_mc_dma_configure,
+ .dma_cleanup = fsl_mc_dma_cleanup,
.dev_groups = fsl_mc_dev_groups,
.bus_groups = fsl_mc_bus_groups,
};
--
2.25.1
Multiple PCI devices may be placed in the same IOMMU group because they
cannot be isolated from each other. These devices must either be entirely
under kernel control or userspace control, never a mixture. This checks
and sets DMA ownership during driver binding, and vice versa, release the
ownership during driver unbinding.
Driver may set a new flag (suppress_auto_claim_dma_owner) to disable auto
claiming DMA_OWNER_DMA_API ownership in the binding process. For instance,
the userspace framework drivers (vfio etc.) which need to manually claim
DMA_OWNER_PRIVATE_DOMAIN_USER when assigning a device to userspace.
Signed-off-by: Lu Baolu <[email protected]>
---
include/linux/pci.h | 5 +++++
drivers/pci/pci-driver.c | 21 +++++++++++++++++++++
2 files changed, 26 insertions(+)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 18a75c8e615c..1b29af0ab43b 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -882,6 +882,10 @@ struct module;
* created once it is bound to the driver.
* @driver: Driver model structure.
* @dynids: List of dynamically added device IDs.
+ * @suppress_auto_claim_dma_owner: Disable auto claiming of kernel DMA owner.
+ * Drivers which don't require DMA or want to manually claim the
+ * owner type (e.g. userspace driver frameworks) could set this
+ * flag.
*/
struct pci_driver {
struct list_head node;
@@ -900,6 +904,7 @@ struct pci_driver {
const struct attribute_group **dev_groups;
struct device_driver driver;
struct pci_dynids dynids;
+ bool suppress_auto_claim_dma_owner;
};
static inline struct pci_driver *to_pci_driver(struct device_driver *drv)
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 588588cfda48..92712645e179 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -20,6 +20,7 @@
#include <linux/of_device.h>
#include <linux/acpi.h>
#include <linux/dma-map-ops.h>
+#include <linux/iommu.h>
#include "pci.h"
#include "pcie/portdrv.h"
@@ -1590,9 +1591,16 @@ static int pci_bus_num_vf(struct device *dev)
*/
static int pci_dma_configure(struct device *dev)
{
+ struct pci_driver *driver = to_pci_driver(dev->driver);
struct device *bridge;
int ret = 0;
+ if (!driver->suppress_auto_claim_dma_owner) {
+ ret = iommu_device_set_dma_owner(dev, DMA_OWNER_DMA_API, NULL);
+ if (ret)
+ return ret;
+ }
+
bridge = pci_get_host_bridge_device(to_pci_dev(dev));
if (IS_ENABLED(CONFIG_OF) && bridge->parent &&
@@ -1605,9 +1613,21 @@ static int pci_dma_configure(struct device *dev)
}
pci_put_host_bridge_device(bridge);
+
+ if (ret && !driver->suppress_auto_claim_dma_owner)
+ iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API);
+
return ret;
}
+static void pci_dma_cleanup(struct device *dev)
+{
+ struct pci_driver *driver = to_pci_driver(dev->driver);
+
+ if (!driver->suppress_auto_claim_dma_owner)
+ iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API);
+}
+
struct bus_type pci_bus_type = {
.name = "pci",
.match = pci_bus_match,
@@ -1621,6 +1641,7 @@ struct bus_type pci_bus_type = {
.pm = PCI_PM_OPS_PTR,
.num_vf = pci_bus_num_vf,
.dma_configure = pci_dma_configure,
+ .dma_cleanup = pci_dma_cleanup,
};
EXPORT_SYMBOL(pci_bus_type);
--
2.25.1
The pci_dma_configure() marks the iommu_group as containing only devices
with kernel drivers that manage DMA. Avoid this default behavior for the
pci_stub because it does not program any DMA itself. This allows the
pci_stub still able to be used by the admin to block driver binding after
applying the DMA ownership to vfio.
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/pci/pci-stub.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/pci/pci-stub.c b/drivers/pci/pci-stub.c
index e408099fea52..14b9b9f2ad2b 100644
--- a/drivers/pci/pci-stub.c
+++ b/drivers/pci/pci-stub.c
@@ -36,6 +36,7 @@ static struct pci_driver stub_driver = {
.name = "pci-stub",
.id_table = NULL, /* only dynamic id's */
.probe = pci_stub_probe,
+ .suppress_auto_claim_dma_owner = true,
};
static int __init pci_stub_init(void)
--
2.25.1
IOMMU grouping on PCI necessitates that if we lack isolation on a bridge
then all of the downstream devices will be part of the same IOMMU group
as the bridge. The existing vfio framework allows the portdrv driver to
be bound to the bridge while its downstream devices are assigned to user
space. The pci_dma_configure() marks the iommu_group as containing only
devices with kernel drivers that manage DMA. Avoid this default behavior
for the portdrv driver in order for compatibility with the current vfio
policy.
The commit 5f096b14d421b ("vfio: Whitelist PCI bridges") extended above
policy to all kernel drivers of bridge class. This is not always safe.
For example, The shpchp_core driver relies on the PCI MMIO access for the
controller functionality. With its downstream devices assigned to the
userspace, the MMIO might be changed through user initiated P2P accesses
without any notification. This might break the kernel driver integrity
and lead to some unpredictable consequences.
For any bridge driver, in order to avoiding default kernel DMA ownership
claiming, we should consider:
1) Does the bridge driver use DMA? Calling pci_set_master() or
a dma_map_* API is a sure indicate the driver is doing DMA
2) If the bridge driver uses MMIO, is it tolerant to hostile
userspace also touching the same MMIO registers via P2P DMA
attacks?
Conservatively if the driver maps an MMIO region at all, we can say that
it fails the test.
Suggested-by: Jason Gunthorpe <[email protected]>
Suggested-by: Kevin Tian <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/pci/pcie/portdrv_pci.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 35eca6277a96..c66a83f2c987 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -202,6 +202,8 @@ static struct pci_driver pcie_portdriver = {
.err_handler = &pcie_portdrv_err_handler,
+ .suppress_auto_claim_dma_owner = true,
+
.driver.pm = PCIE_PORTDRV_PM_OPS,
};
--
2.25.1
When an iommu group has DMA_OWNER_PRIVATE_DOMAIN_USER set for the first
time, it is a contract that the group could be assigned to userspace from
now on. The group must be detached from the default iommu domain and all
devices in this group are blocked from doing DMA until it is attached to a
user controlled iommu_domain. Correspondingly, the default domain should
be reattached after the last DMA_OWNER_USER is released.
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/iommu/iommu.c | 42 +++++++++++++++++++++++++++++++++++++++---
1 file changed, 39 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 1de520a07518..0cba04a8ea3b 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -292,7 +292,12 @@ int iommu_probe_device(struct device *dev)
mutex_lock(&group->mutex);
iommu_alloc_default_domain(group, dev);
- if (group->default_domain) {
+ /*
+ * If any device in the group has been initialized for user dma,
+ * avoid attaching the default domain.
+ */
+ if (group->default_domain &&
+ group->dma_owner != DMA_OWNER_PRIVATE_DOMAIN_USER) {
ret = __iommu_attach_device(group->default_domain, dev);
if (ret) {
mutex_unlock(&group->mutex);
@@ -2324,7 +2329,7 @@ static int __iommu_attach_group(struct iommu_domain *domain,
{
int ret;
- if (group->default_domain && group->domain != group->default_domain)
+ if (group->domain && group->domain != group->default_domain)
return -EBUSY;
ret = __iommu_group_for_each_dev(group, domain,
@@ -2361,7 +2366,12 @@ static void __iommu_detach_group(struct iommu_domain *domain,
{
int ret;
- if (!group->default_domain) {
+ /*
+ * If any device in the group has been initialized for user dma,
+ * avoid re-attaching the default domain.
+ */
+ if (!group->default_domain ||
+ group->dma_owner == DMA_OWNER_PRIVATE_DOMAIN_USER) {
__iommu_group_for_each_dev(group, domain,
iommu_group_do_detach_device);
group->domain = NULL;
@@ -3371,6 +3381,23 @@ static int __iommu_group_set_dma_owner(struct iommu_group *group,
}
group->dma_owner = owner;
+
+ /*
+ * We must ensure that any device DMAs issued after this call
+ * are discarded. DMAs can only reach real memory once someone
+ * has attached a real domain.
+ */
+ if (owner == DMA_OWNER_PRIVATE_DOMAIN_USER) {
+ if (group->domain) {
+ if (group->domain != group->default_domain) {
+ group->dma_owner = DMA_OWNER_NONE;
+ return -EBUSY;
+ }
+
+ __iommu_detach_group(group->domain, group);
+ }
+ }
+
group->owner_cookie = owner_cookie;
refcount_set(&group->owner_cnt, 1);
@@ -3387,6 +3414,15 @@ static void __iommu_group_release_dma_owner(struct iommu_group *group,
return;
group->dma_owner = DMA_OWNER_NONE;
+
+ /*
+ * The UNMANAGED domain should be detached before all USER
+ * owners have been released.
+ */
+ if (owner == DMA_OWNER_PRIVATE_DOMAIN_USER) {
+ if (!WARN_ON(group->domain) && group->default_domain)
+ __iommu_attach_group(group->default_domain, group);
+ }
}
/**
--
2.25.1
The iommu_attach/detach_device() interfaces were exposed for the device
drivers to attach/detach their own domains. The commit <426a273834eae>
("iommu: Limit iommu_attach/detach_device to device with their own group")
restricted them to singleton groups to avoid different device in a group
attaching different domain.
As we've introduced device DMA ownership into the iommu core. We can now
introduce interfaces for muliple-device groups, and "all devices are in the
same address space" is still guaranteed.
The iommu_attach/detach_device_shared() could be used when multiple drivers
sharing the group claim the DMA_OWNER_PRIVATE_DOMAIN ownership. The first
call of iommu_attach_device_shared() attaches the domain to the group.
Other drivers could join it later. The domain will be detached from the
group after all drivers unjoin it.
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
Tested-by: Dmitry Osipenko <[email protected]>
---
include/linux/iommu.h | 13 +++++++++
drivers/iommu/iommu.c | 61 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 74 insertions(+)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index afcc07bc8d41..8c81ba11ae8c 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -705,6 +705,8 @@ int iommu_group_set_dma_owner(struct iommu_group *group, enum iommu_dma_owner ow
void *owner_cookie);
void iommu_group_release_dma_owner(struct iommu_group *group, enum iommu_dma_owner owner);
bool iommu_group_dma_owner_unclaimed(struct iommu_group *group);
+int iommu_attach_device_shared(struct iommu_domain *domain, struct device *dev);
+void iommu_detach_device_shared(struct iommu_domain *domain, struct device *dev);
#else /* CONFIG_IOMMU_API */
@@ -745,11 +747,22 @@ static inline int iommu_attach_device(struct iommu_domain *domain,
return -ENODEV;
}
+static inline int iommu_attach_device_shared(struct iommu_domain *domain,
+ struct device *dev)
+{
+ return -ENODEV;
+}
+
static inline void iommu_detach_device(struct iommu_domain *domain,
struct device *dev)
{
}
+static inline void iommu_detach_device_shared(struct iommu_domain *domain,
+ struct device *dev)
+{
+}
+
static inline struct iommu_domain *iommu_get_domain_for_dev(struct device *dev)
{
return NULL;
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 423197db99a9..6dc6e6ad6069 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -50,6 +50,7 @@ struct iommu_group {
struct list_head entry;
enum iommu_dma_owner dma_owner;
refcount_t owner_cnt;
+ refcount_t attach_cnt;
void *owner_cookie;
};
@@ -2026,6 +2027,41 @@ int iommu_attach_device(struct iommu_domain *domain, struct device *dev)
}
EXPORT_SYMBOL_GPL(iommu_attach_device);
+int iommu_attach_device_shared(struct iommu_domain *domain, struct device *dev)
+{
+ struct iommu_group *group;
+ int ret = 0;
+
+ group = iommu_group_get(dev);
+ if (!group)
+ return -ENODEV;
+
+ mutex_lock(&group->mutex);
+ if (refcount_inc_not_zero(&group->attach_cnt)) {
+ if (group->domain != domain ||
+ (group->dma_owner != DMA_OWNER_PRIVATE_DOMAIN &&
+ group->dma_owner != DMA_OWNER_PRIVATE_DOMAIN_USER)) {
+ refcount_dec(&group->attach_cnt);
+ ret = -EBUSY;
+ }
+
+ goto unlock_out;
+ }
+
+ ret = __iommu_attach_group(domain, group);
+ if (ret)
+ goto unlock_out;
+
+ refcount_set(&group->attach_cnt, 1);
+
+unlock_out:
+ mutex_unlock(&group->mutex);
+ iommu_group_put(group);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_attach_device_shared);
+
int iommu_deferred_attach(struct device *dev, struct iommu_domain *domain)
{
const struct iommu_ops *ops = domain->ops;
@@ -2281,6 +2317,31 @@ void iommu_detach_device(struct iommu_domain *domain, struct device *dev)
}
EXPORT_SYMBOL_GPL(iommu_detach_device);
+void iommu_detach_device_shared(struct iommu_domain *domain, struct device *dev)
+{
+ struct iommu_group *group;
+
+ group = iommu_group_get(dev);
+ if (!group)
+ return;
+
+ mutex_lock(&group->mutex);
+ if (WARN_ON(group->domain != domain ||
+ (group->dma_owner != DMA_OWNER_PRIVATE_DOMAIN &&
+ group->dma_owner != DMA_OWNER_PRIVATE_DOMAIN_USER)))
+ goto unlock_out;
+
+ if (refcount_dec_and_test(&group->attach_cnt)) {
+ __iommu_detach_group(domain, group);
+ group->dma_owner = DMA_OWNER_NONE;
+ }
+
+unlock_out:
+ mutex_unlock(&group->mutex);
+ iommu_group_put(group);
+}
+EXPORT_SYMBOL_GPL(iommu_detach_device_shared);
+
struct iommu_domain *iommu_get_domain_for_dev(struct device *dev)
{
struct iommu_domain *domain;
--
2.25.1
The vfio needs to set DMA_OWNER_PRIVATE_DOMAIN_USER for the entire group
when attaching it to a vfio container. Expose group variants of setting/
releasing dma ownership for this purpose.
This also exposes the helper iommu_group_dma_owner_unclaimed() for vfio
to report to userspace if the group is viable to user assignment for
compatibility with VFIO_GROUP_FLAGS_VIABLE.
Signed-off-by: Lu Baolu <[email protected]>
---
include/linux/iommu.h | 21 ++++++++++++++++
drivers/iommu/iommu.c | 58 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 79 insertions(+)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 24676b498f38..afcc07bc8d41 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -701,6 +701,10 @@ u32 iommu_sva_get_pasid(struct iommu_sva *handle);
int iommu_device_set_dma_owner(struct device *dev, enum iommu_dma_owner owner,
void *owner_cookie);
void iommu_device_release_dma_owner(struct device *dev, enum iommu_dma_owner owner);
+int iommu_group_set_dma_owner(struct iommu_group *group, enum iommu_dma_owner owner,
+ void *owner_cookie);
+void iommu_group_release_dma_owner(struct iommu_group *group, enum iommu_dma_owner owner);
+bool iommu_group_dma_owner_unclaimed(struct iommu_group *group);
#else /* CONFIG_IOMMU_API */
@@ -1117,6 +1121,23 @@ static inline void iommu_device_release_dma_owner(struct device *dev,
enum iommu_dma_owner owner)
{
}
+
+static inline int iommu_group_set_dma_owner(struct iommu_group *group,
+ enum iommu_dma_owner owner,
+ void *owner_cookie)
+{
+ return -EINVAL;
+}
+
+static inline void iommu_group_release_dma_owner(struct iommu_group *group,
+ enum iommu_dma_owner owner)
+{
+}
+
+static inline bool iommu_group_dma_owner_unclaimed(struct iommu_group *group)
+{
+ return false;
+}
#endif /* CONFIG_IOMMU_API */
/**
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 0cba04a8ea3b..423197db99a9 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -3425,6 +3425,64 @@ static void __iommu_group_release_dma_owner(struct iommu_group *group,
}
}
+/**
+ * iommu_group_set_dma_owner() - Set DMA ownership of a group
+ * @group: The group.
+ * @owner: DMA owner type.
+ * @owner_cookie: Caller specified pointer. Could be used for exclusive
+ * declaration. Could be NULL.
+ *
+ * This is to support backward compatibility for legacy vfio which manages
+ * dma ownership in group level. New invocations on this interface should be
+ * prohibited. Instead, please turn to iommu_device_set_dma_owner().
+ */
+int iommu_group_set_dma_owner(struct iommu_group *group, enum iommu_dma_owner owner,
+ void *owner_cookie)
+{
+ int ret;
+
+ mutex_lock(&group->mutex);
+ ret = __iommu_group_set_dma_owner(group, owner, owner_cookie);
+ mutex_unlock(&group->mutex);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_group_set_dma_owner);
+
+/**
+ * iommu_group_release_dma_owner() - Release DMA ownership of a group
+ * @group: The group.
+ * @owner: DMA owner type.
+ *
+ * Release the DMA ownership claimed by iommu_group_set_dma_owner().
+ */
+void iommu_group_release_dma_owner(struct iommu_group *group, enum iommu_dma_owner owner)
+{
+ mutex_lock(&group->mutex);
+ __iommu_group_release_dma_owner(group, owner);
+ mutex_unlock(&group->mutex);
+}
+EXPORT_SYMBOL_GPL(iommu_group_release_dma_owner);
+
+/**
+ * iommu_group_dma_owner_unclaimed() - Is group dma ownership claimed
+ * @group: The group.
+ *
+ * This provides status check on a given group. It is racey and only for
+ * non-binding status reporting.
+ */
+bool iommu_group_dma_owner_unclaimed(struct iommu_group *group)
+{
+ enum iommu_dma_owner owner;
+
+ mutex_lock(&group->mutex);
+ owner = group->dma_owner;
+ mutex_unlock(&group->mutex);
+
+ return owner == DMA_OWNER_NONE;
+}
+EXPORT_SYMBOL_GPL(iommu_group_dma_owner_unclaimed);
+
/**
* iommu_device_set_dma_owner() - Set DMA ownership of a device
* @dev: The device.
--
2.25.1
As DMA USER ownership is claimed for the iommu group when a vfio group is
added to a vfio container, the vfio group viability is guaranteed as long
as group->container_users > 0. Remove those unnecessary group viability
checks which are only hit when group->container_users is not zero.
The only remaining reference is in GROUP_GET_STATUS, which could be called
at any time when group fd is valid. Here we just replace the
vfio_group_viable() by directly calling iommu core to get viability status.
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/vfio/vfio.c | 18 ++++++------------
1 file changed, 6 insertions(+), 12 deletions(-)
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 00f89acfec39..63e24f746b82 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1316,12 +1316,6 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
return ret;
}
-static bool vfio_group_viable(struct vfio_group *group)
-{
- return (iommu_group_for_each_dev(group->iommu_group,
- group, vfio_dev_viable) == 0);
-}
-
static int vfio_group_add_container_user(struct vfio_group *group)
{
if (!atomic_inc_not_zero(&group->container_users))
@@ -1331,7 +1325,7 @@ static int vfio_group_add_container_user(struct vfio_group *group)
atomic_dec(&group->container_users);
return -EPERM;
}
- if (!group->container->iommu_driver || !vfio_group_viable(group)) {
+ if (!group->container->iommu_driver) {
atomic_dec(&group->container_users);
return -EINVAL;
}
@@ -1349,7 +1343,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
int ret = 0;
if (0 == atomic_read(&group->container_users) ||
- !group->container->iommu_driver || !vfio_group_viable(group))
+ !group->container->iommu_driver)
return -EINVAL;
if (group->type == VFIO_NO_IOMMU && !capable(CAP_SYS_RAWIO))
@@ -1441,11 +1435,11 @@ static long vfio_group_fops_unl_ioctl(struct file *filep,
status.flags = 0;
- if (vfio_group_viable(group))
- status.flags |= VFIO_GROUP_FLAGS_VIABLE;
-
if (group->container)
- status.flags |= VFIO_GROUP_FLAGS_CONTAINER_SET;
+ status.flags |= VFIO_GROUP_FLAGS_CONTAINER_SET |
+ VFIO_GROUP_FLAGS_VIABLE;
+ else if (iommu_group_dma_owner_unclaimed(group->iommu_group))
+ status.flags |= VFIO_GROUP_FLAGS_VIABLE;
if (copy_to_user((void __user *)arg, &status, minsz))
return -EFAULT;
--
2.25.1
Set DMA_OWNER_PRIVATE_DOMAIN_USER when an iommu group is set to a
container, and release DMA_OWNER_USER once the iommu group is unset
from a container.
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/vfio/fsl-mc/vfio_fsl_mc.c | 1 +
drivers/vfio/pci/vfio_pci.c | 1 +
drivers/vfio/platform/vfio_amba.c | 1 +
drivers/vfio/platform/vfio_platform.c | 1 +
drivers/vfio/vfio.c | 13 ++++++++++++-
5 files changed, 16 insertions(+), 1 deletion(-)
diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
index 6e2e62c6f47a..5f36ffbb07d1 100644
--- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c
+++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
@@ -588,6 +588,7 @@ static struct fsl_mc_driver vfio_fsl_mc_driver = {
.name = "vfio-fsl-mc",
.owner = THIS_MODULE,
},
+ .suppress_auto_claim_dma_owner = true,
};
static int __init vfio_fsl_mc_driver_init(void)
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index a5ce92beb655..31810b074aa4 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -193,6 +193,7 @@ static struct pci_driver vfio_pci_driver = {
.remove = vfio_pci_remove,
.sriov_configure = vfio_pci_sriov_configure,
.err_handler = &vfio_pci_core_err_handlers,
+ .suppress_auto_claim_dma_owner = true,
};
static void __init vfio_pci_fill_ids(void)
diff --git a/drivers/vfio/platform/vfio_amba.c b/drivers/vfio/platform/vfio_amba.c
index badfffea14fb..e598d6af90ad 100644
--- a/drivers/vfio/platform/vfio_amba.c
+++ b/drivers/vfio/platform/vfio_amba.c
@@ -95,6 +95,7 @@ static struct amba_driver vfio_amba_driver = {
.name = "vfio-amba",
.owner = THIS_MODULE,
},
+ .suppress_auto_claim_dma_owner = true,
};
module_amba_driver(vfio_amba_driver);
diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
index 68a1c87066d7..8bda68775b97 100644
--- a/drivers/vfio/platform/vfio_platform.c
+++ b/drivers/vfio/platform/vfio_platform.c
@@ -76,6 +76,7 @@ static struct platform_driver vfio_platform_driver = {
.driver = {
.name = "vfio-platform",
},
+ .suppress_auto_claim_dma_owner = true,
};
module_platform_driver(vfio_platform_driver);
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 82fb75464f92..00f89acfec39 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1198,6 +1198,9 @@ static void __vfio_group_unset_container(struct vfio_group *group)
driver->ops->detach_group(container->iommu_data,
group->iommu_group);
+ iommu_group_release_dma_owner(group->iommu_group,
+ DMA_OWNER_PRIVATE_DOMAIN_USER);
+
group->container = NULL;
wake_up(&group->container_q);
list_del(&group->container_next);
@@ -1282,13 +1285,21 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
goto unlock_out;
}
+ ret = iommu_group_set_dma_owner(group->iommu_group,
+ DMA_OWNER_PRIVATE_DOMAIN_USER, f.file);
+ if (ret)
+ goto unlock_out;
+
driver = container->iommu_driver;
if (driver) {
ret = driver->ops->attach_group(container->iommu_data,
group->iommu_group,
group->type);
- if (ret)
+ if (ret) {
+ iommu_group_release_dma_owner(group->iommu_group,
+ DMA_OWNER_PRIVATE_DOMAIN_USER);
goto unlock_out;
+ }
}
group->container = container;
--
2.25.1
From: Jason Gunthorpe <[email protected]>
commit 60720a0fc646 ("vfio: Add device tracking during unbind") added the
unbound list to plug a problem with KVM where KVM_DEV_VFIO_GROUP_DEL
relied on vfio_group_get_external_user() succeeding to return the
vfio_group from a group file descriptor. The unbound list allowed
vfio_group_get_external_user() to continue to succeed in edge cases.
However commit 5d6dee80a1e9 ("vfio: New external user group/file match")
deleted the call to vfio_group_get_external_user() during
KVM_DEV_VFIO_GROUP_DEL. Instead vfio_external_group_match_file() is used
to directly match the file descriptor to the group pointer.
This in turn avoids the call down to vfio_dev_viable() during
KVM_DEV_VFIO_GROUP_DEL and also avoids the trouble the first commit was
trying to fix.
There are no other users of vfio_dev_viable() that care about the time
after vfio_unregister_group_dev() returns, so simply delete the
unbound_list entirely.
Reviewed-by: Chaitanya Kulkarni <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/vfio/vfio.c | 74 ++-------------------------------------------
1 file changed, 2 insertions(+), 72 deletions(-)
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 63e24f746b82..5c81346367b1 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -62,11 +62,6 @@ struct vfio_container {
bool noiommu;
};
-struct vfio_unbound_dev {
- struct device *dev;
- struct list_head unbound_next;
-};
-
struct vfio_group {
struct device dev;
struct cdev cdev;
@@ -79,8 +74,6 @@ struct vfio_group {
struct notifier_block nb;
struct list_head vfio_next;
struct list_head container_next;
- struct list_head unbound_list;
- struct mutex unbound_lock;
atomic_t opened;
wait_queue_head_t container_q;
enum vfio_group_type type;
@@ -340,16 +333,8 @@ vfio_group_get_from_iommu(struct iommu_group *iommu_group)
static void vfio_group_release(struct device *dev)
{
struct vfio_group *group = container_of(dev, struct vfio_group, dev);
- struct vfio_unbound_dev *unbound, *tmp;
-
- list_for_each_entry_safe(unbound, tmp,
- &group->unbound_list, unbound_next) {
- list_del(&unbound->unbound_next);
- kfree(unbound);
- }
mutex_destroy(&group->device_lock);
- mutex_destroy(&group->unbound_lock);
iommu_group_put(group->iommu_group);
ida_free(&vfio.group_ida, MINOR(group->dev.devt));
kfree(group);
@@ -381,8 +366,6 @@ static struct vfio_group *vfio_group_alloc(struct iommu_group *iommu_group,
refcount_set(&group->users, 1);
INIT_LIST_HEAD(&group->device_list);
mutex_init(&group->device_lock);
- INIT_LIST_HEAD(&group->unbound_list);
- mutex_init(&group->unbound_lock);
init_waitqueue_head(&group->container_q);
group->iommu_group = iommu_group;
/* put in vfio_group_release() */
@@ -571,19 +554,8 @@ static int vfio_dev_viable(struct device *dev, void *data)
struct vfio_group *group = data;
struct vfio_device *device;
struct device_driver *drv = READ_ONCE(dev->driver);
- struct vfio_unbound_dev *unbound;
- int ret = -EINVAL;
- mutex_lock(&group->unbound_lock);
- list_for_each_entry(unbound, &group->unbound_list, unbound_next) {
- if (dev == unbound->dev) {
- ret = 0;
- break;
- }
- }
- mutex_unlock(&group->unbound_lock);
-
- if (!ret || !drv || vfio_dev_driver_allowed(dev, drv))
+ if (!drv || vfio_dev_driver_allowed(dev, drv))
return 0;
device = vfio_group_get_device(group, dev);
@@ -592,7 +564,7 @@ static int vfio_dev_viable(struct device *dev, void *data)
return 0;
}
- return ret;
+ return -EINVAL;
}
/**
@@ -634,7 +606,6 @@ static int vfio_iommu_group_notifier(struct notifier_block *nb,
{
struct vfio_group *group = container_of(nb, struct vfio_group, nb);
struct device *dev = data;
- struct vfio_unbound_dev *unbound;
switch (action) {
case IOMMU_GROUP_NOTIFY_ADD_DEVICE:
@@ -663,28 +634,6 @@ static int vfio_iommu_group_notifier(struct notifier_block *nb,
__func__, iommu_group_id(group->iommu_group),
dev->driver->name);
break;
- case IOMMU_GROUP_NOTIFY_UNBOUND_DRIVER:
- dev_dbg(dev, "%s: group %d unbound from driver\n", __func__,
- iommu_group_id(group->iommu_group));
- /*
- * XXX An unbound device in a live group is ok, but we'd
- * really like to avoid the above BUG_ON by preventing other
- * drivers from binding to it. Once that occurs, we have to
- * stop the system to maintain isolation. At a minimum, we'd
- * want a toggle to disable driver auto probe for this device.
- */
-
- mutex_lock(&group->unbound_lock);
- list_for_each_entry(unbound,
- &group->unbound_list, unbound_next) {
- if (dev == unbound->dev) {
- list_del(&unbound->unbound_next);
- kfree(unbound);
- break;
- }
- }
- mutex_unlock(&group->unbound_lock);
- break;
}
return NOTIFY_OK;
}
@@ -889,29 +838,10 @@ static struct vfio_device *vfio_device_get_from_name(struct vfio_group *group,
void vfio_unregister_group_dev(struct vfio_device *device)
{
struct vfio_group *group = device->group;
- struct vfio_unbound_dev *unbound;
unsigned int i = 0;
bool interrupted = false;
long rc;
- /*
- * When the device is removed from the group, the group suddenly
- * becomes non-viable; the device has a driver (until the unbind
- * completes), but it's not present in the group. This is bad news
- * for any external users that need to re-acquire a group reference
- * in order to match and release their existing reference. To
- * solve this, we track such devices on the unbound_list to bridge
- * the gap until they're fully unbound.
- */
- unbound = kzalloc(sizeof(*unbound), GFP_KERNEL);
- if (unbound) {
- unbound->dev = device->dev;
- mutex_lock(&group->unbound_lock);
- list_add(&unbound->unbound_next, &group->unbound_list);
- mutex_unlock(&group->unbound_lock);
- }
- WARN_ON(!unbound);
-
vfio_device_put(device);
rc = try_wait_for_completion(&device->comp);
while (rc <= 0) {
--
2.25.1
The iommu group changes notifer is not referenced in the tree. Remove it
to avoid dead code.
Suggested-by: Christoph Hellwig <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
---
include/linux/iommu.h | 23 -------------
drivers/iommu/iommu.c | 75 -------------------------------------------
2 files changed, 98 deletions(-)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 8c81ba11ae8c..e7eeac801bcd 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -419,13 +419,6 @@ static inline void iommu_iotlb_gather_init(struct iommu_iotlb_gather *gather)
};
}
-#define IOMMU_GROUP_NOTIFY_ADD_DEVICE 1 /* Device added */
-#define IOMMU_GROUP_NOTIFY_DEL_DEVICE 2 /* Pre Device removed */
-#define IOMMU_GROUP_NOTIFY_BIND_DRIVER 3 /* Pre Driver bind */
-#define IOMMU_GROUP_NOTIFY_BOUND_DRIVER 4 /* Post Driver bind */
-#define IOMMU_GROUP_NOTIFY_UNBIND_DRIVER 5 /* Pre Driver unbind */
-#define IOMMU_GROUP_NOTIFY_UNBOUND_DRIVER 6 /* Post Driver unbind */
-
extern int bus_set_iommu(struct bus_type *bus, const struct iommu_ops *ops);
extern int bus_iommu_probe(struct bus_type *bus);
extern bool iommu_present(struct bus_type *bus);
@@ -498,10 +491,6 @@ extern int iommu_group_for_each_dev(struct iommu_group *group, void *data,
extern struct iommu_group *iommu_group_get(struct device *dev);
extern struct iommu_group *iommu_group_ref_get(struct iommu_group *group);
extern void iommu_group_put(struct iommu_group *group);
-extern int iommu_group_register_notifier(struct iommu_group *group,
- struct notifier_block *nb);
-extern int iommu_group_unregister_notifier(struct iommu_group *group,
- struct notifier_block *nb);
extern int iommu_register_device_fault_handler(struct device *dev,
iommu_dev_fault_handler_t handler,
void *data);
@@ -915,18 +904,6 @@ static inline void iommu_group_put(struct iommu_group *group)
{
}
-static inline int iommu_group_register_notifier(struct iommu_group *group,
- struct notifier_block *nb)
-{
- return -ENODEV;
-}
-
-static inline int iommu_group_unregister_notifier(struct iommu_group *group,
- struct notifier_block *nb)
-{
- return 0;
-}
-
static inline
int iommu_register_device_fault_handler(struct device *dev,
iommu_dev_fault_handler_t handler,
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 6dc6e6ad6069..6f122e656ccb 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -18,7 +18,6 @@
#include <linux/errno.h>
#include <linux/iommu.h>
#include <linux/idr.h>
-#include <linux/notifier.h>
#include <linux/err.h>
#include <linux/pci.h>
#include <linux/bitops.h>
@@ -40,7 +39,6 @@ struct iommu_group {
struct kobject *devices_kobj;
struct list_head devices;
struct mutex mutex;
- struct blocking_notifier_head notifier;
void *iommu_data;
void (*iommu_data_release)(void *iommu_data);
char *name;
@@ -629,7 +627,6 @@ struct iommu_group *iommu_group_alloc(void)
mutex_init(&group->mutex);
INIT_LIST_HEAD(&group->devices);
INIT_LIST_HEAD(&group->entry);
- BLOCKING_INIT_NOTIFIER_HEAD(&group->notifier);
group->dma_owner = DMA_OWNER_NONE;
ret = ida_simple_get(&iommu_group_ida, 0, 0, GFP_KERNEL);
@@ -905,10 +902,6 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
if (ret)
goto err_put_group;
- /* Notify any listeners about change to group. */
- blocking_notifier_call_chain(&group->notifier,
- IOMMU_GROUP_NOTIFY_ADD_DEVICE, dev);
-
trace_add_device_to_group(group->id, dev);
dev_info(dev, "Adding to iommu group %d\n", group->id);
@@ -950,10 +943,6 @@ void iommu_group_remove_device(struct device *dev)
dev_info(dev, "Removing from iommu group %d\n", group->id);
- /* Pre-notify listeners that a device is being removed. */
- blocking_notifier_call_chain(&group->notifier,
- IOMMU_GROUP_NOTIFY_DEL_DEVICE, dev);
-
mutex_lock(&group->mutex);
list_for_each_entry(tmp_device, &group->devices, list) {
if (tmp_device->dev == dev) {
@@ -1076,36 +1065,6 @@ void iommu_group_put(struct iommu_group *group)
}
EXPORT_SYMBOL_GPL(iommu_group_put);
-/**
- * iommu_group_register_notifier - Register a notifier for group changes
- * @group: the group to watch
- * @nb: notifier block to signal
- *
- * This function allows iommu group users to track changes in a group.
- * See include/linux/iommu.h for actions sent via this notifier. Caller
- * should hold a reference to the group throughout notifier registration.
- */
-int iommu_group_register_notifier(struct iommu_group *group,
- struct notifier_block *nb)
-{
- return blocking_notifier_chain_register(&group->notifier, nb);
-}
-EXPORT_SYMBOL_GPL(iommu_group_register_notifier);
-
-/**
- * iommu_group_unregister_notifier - Unregister a notifier
- * @group: the group to watch
- * @nb: notifier block to signal
- *
- * Unregister a previously registered group notifier block.
- */
-int iommu_group_unregister_notifier(struct iommu_group *group,
- struct notifier_block *nb)
-{
- return blocking_notifier_chain_unregister(&group->notifier, nb);
-}
-EXPORT_SYMBOL_GPL(iommu_group_unregister_notifier);
-
/**
* iommu_register_device_fault_handler() - Register a device fault handler
* @dev: the device
@@ -1654,14 +1613,8 @@ static int remove_iommu_group(struct device *dev, void *data)
static int iommu_bus_notifier(struct notifier_block *nb,
unsigned long action, void *data)
{
- unsigned long group_action = 0;
struct device *dev = data;
- struct iommu_group *group;
- /*
- * ADD/DEL call into iommu driver ops if provided, which may
- * result in ADD/DEL notifiers to group->notifier
- */
if (action == BUS_NOTIFY_ADD_DEVICE) {
int ret;
@@ -1672,34 +1625,6 @@ static int iommu_bus_notifier(struct notifier_block *nb,
return NOTIFY_OK;
}
- /*
- * Remaining BUS_NOTIFYs get filtered and republished to the
- * group, if anyone is listening
- */
- group = iommu_group_get(dev);
- if (!group)
- return 0;
-
- switch (action) {
- case BUS_NOTIFY_BIND_DRIVER:
- group_action = IOMMU_GROUP_NOTIFY_BIND_DRIVER;
- break;
- case BUS_NOTIFY_BOUND_DRIVER:
- group_action = IOMMU_GROUP_NOTIFY_BOUND_DRIVER;
- break;
- case BUS_NOTIFY_UNBIND_DRIVER:
- group_action = IOMMU_GROUP_NOTIFY_UNBIND_DRIVER;
- break;
- case BUS_NOTIFY_UNBOUND_DRIVER:
- group_action = IOMMU_GROUP_NOTIFY_UNBOUND_DRIVER;
- break;
- }
-
- if (group_action)
- blocking_notifier_call_chain(&group->notifier,
- group_action, dev);
-
- iommu_group_put(group);
return 0;
}
--
2.25.1
From: Jason Gunthorpe <[email protected]>
Tegra joins many platform devices onto the same iommu_domain and builds
sort-of a DMA API on top of it.
Given that iommu_attach/detatch_device_shared() has supported this usage
model. Each device that wants to use the special domain will use
suppress_auto_claim_dma_owner and call iommu_attach_device_shared() which
will use dma owner framework to lock out other usages of the group and
refcount the domain attachment.
When the last device calls detatch the domain will be disconnected.
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
Tested-by: Dmitry Osipenko <[email protected]> # Nexus7 T30
---
drivers/gpu/drm/tegra/dc.c | 1 +
drivers/gpu/drm/tegra/drm.c | 55 ++++++++++++++++++------------------
drivers/gpu/drm/tegra/gr2d.c | 1 +
drivers/gpu/drm/tegra/gr3d.c | 1 +
drivers/gpu/drm/tegra/vic.c | 1 +
5 files changed, 31 insertions(+), 28 deletions(-)
diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index a29d64f87563..3a2458f56fbc 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -3111,4 +3111,5 @@ struct platform_driver tegra_dc_driver = {
},
.probe = tegra_dc_probe,
.remove = tegra_dc_remove,
+ .suppress_auto_claim_dma_owner = true,
};
diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index 8d37d6b00562..1e83e3f5da5b 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -928,12 +928,15 @@ int tegra_drm_unregister_client(struct tegra_drm *tegra,
return 0;
}
+/*
+ * Clients which use this function must set suppress_auto_claim_dma_owner in
+ * their platform_driver's device_driver struct.
+ */
int host1x_client_iommu_attach(struct host1x_client *client)
{
struct iommu_domain *domain = iommu_get_domain_for_dev(client->dev);
struct drm_device *drm = dev_get_drvdata(client->host);
struct tegra_drm *tegra = drm->dev_private;
- struct iommu_group *group = NULL;
int err;
/*
@@ -941,48 +944,44 @@ int host1x_client_iommu_attach(struct host1x_client *client)
* not the shared IOMMU domain, don't try to attach it to a different
* domain. This allows using the IOMMU-backed DMA API.
*/
- if (domain && domain != tegra->domain)
+ client->group = NULL;
+ if (!client->dev->iommu_group || (domain && domain != tegra->domain))
+ return iommu_device_set_dma_owner(client->dev,
+ DMA_OWNER_DMA_API, NULL);
+
+ if (!tegra->domain)
return 0;
- if (tegra->domain) {
- group = iommu_group_get(client->dev);
- if (!group)
- return -ENODEV;
-
- if (domain != tegra->domain) {
- err = iommu_attach_group(tegra->domain, group);
- if (err < 0) {
- iommu_group_put(group);
- return err;
- }
- }
+ err = iommu_device_set_dma_owner(client->dev,
+ DMA_OWNER_PRIVATE_DOMAIN, NULL);
+ if (err)
+ return err;
- tegra->use_explicit_iommu = true;
+ err = iommu_attach_device_shared(tegra->domain, client->dev);
+ if (err) {
+ iommu_device_release_dma_owner(client->dev,
+ DMA_OWNER_PRIVATE_DOMAIN);
+ return err;
}
- client->group = group;
-
+ tegra->use_explicit_iommu = true;
+ client->group = client->dev->iommu_group;
return 0;
}
void host1x_client_iommu_detach(struct host1x_client *client)
{
+ struct iommu_domain *domain = iommu_get_domain_for_dev(client->dev);
struct drm_device *drm = dev_get_drvdata(client->host);
struct tegra_drm *tegra = drm->dev_private;
- struct iommu_domain *domain;
if (client->group) {
- /*
- * Devices that are part of the same group may no longer be
- * attached to a domain at this point because their group may
- * have been detached by an earlier client.
- */
- domain = iommu_get_domain_for_dev(client->dev);
- if (domain)
- iommu_detach_group(tegra->domain, client->group);
-
- iommu_group_put(client->group);
+ iommu_detach_device_shared(tegra->domain, client->dev);
+ iommu_device_release_dma_owner(client->dev,
+ DMA_OWNER_PRIVATE_DOMAIN);
client->group = NULL;
+ } else {
+ iommu_device_release_dma_owner(client->dev, DMA_OWNER_DMA_API);
}
}
diff --git a/drivers/gpu/drm/tegra/gr2d.c b/drivers/gpu/drm/tegra/gr2d.c
index de288cba3905..8d92141c3c86 100644
--- a/drivers/gpu/drm/tegra/gr2d.c
+++ b/drivers/gpu/drm/tegra/gr2d.c
@@ -271,4 +271,5 @@ struct platform_driver tegra_gr2d_driver = {
},
.probe = gr2d_probe,
.remove = gr2d_remove,
+ .suppress_auto_claim_dma_owner = true,
};
diff --git a/drivers/gpu/drm/tegra/gr3d.c b/drivers/gpu/drm/tegra/gr3d.c
index 24442ade0da3..33c4954977ab 100644
--- a/drivers/gpu/drm/tegra/gr3d.c
+++ b/drivers/gpu/drm/tegra/gr3d.c
@@ -400,4 +400,5 @@ struct platform_driver tegra_gr3d_driver = {
},
.probe = gr3d_probe,
.remove = gr3d_remove,
+ .suppress_auto_claim_dma_owner = true,
};
diff --git a/drivers/gpu/drm/tegra/vic.c b/drivers/gpu/drm/tegra/vic.c
index c02010ff2b7f..114df067ea00 100644
--- a/drivers/gpu/drm/tegra/vic.c
+++ b/drivers/gpu/drm/tegra/vic.c
@@ -527,6 +527,7 @@ struct platform_driver tegra_vic_driver = {
},
.probe = vic_probe,
.remove = vic_remove,
+ .suppress_auto_claim_dma_owner = true,
};
#if IS_ENABLED(CONFIG_ARCH_TEGRA_124_SOC)
--
2.25.1
The iommu core and driver core have been enhanced to avoid unsafe driver
binding to a live group after iommu_group_set_dma_owner(PRIVATE_USER)
has been called. There's no need to register iommu group notifier. This
removes the iommu group notifer which contains BUG_ON() and WARN().
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/vfio/vfio.c | 147 --------------------------------------------
1 file changed, 147 deletions(-)
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 5c81346367b1..33d984ff3cc5 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -71,7 +71,6 @@ struct vfio_group {
struct vfio_container *container;
struct list_head device_list;
struct mutex device_lock;
- struct notifier_block nb;
struct list_head vfio_next;
struct list_head container_next;
atomic_t opened;
@@ -274,8 +273,6 @@ void vfio_unregister_iommu_driver(const struct vfio_iommu_driver_ops *ops)
}
EXPORT_SYMBOL_GPL(vfio_unregister_iommu_driver);
-static int vfio_iommu_group_notifier(struct notifier_block *nb,
- unsigned long action, void *data);
static void vfio_group_get(struct vfio_group *group);
/**
@@ -395,13 +392,6 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
goto err_put;
}
- group->nb.notifier_call = vfio_iommu_group_notifier;
- err = iommu_group_register_notifier(iommu_group, &group->nb);
- if (err) {
- ret = ERR_PTR(err);
- goto err_put;
- }
-
mutex_lock(&vfio.group_lock);
/* Did we race creating this group? */
@@ -422,7 +412,6 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
err_unlock:
mutex_unlock(&vfio.group_lock);
- iommu_group_unregister_notifier(group->iommu_group, &group->nb);
err_put:
put_device(&group->dev);
return ret;
@@ -447,7 +436,6 @@ static void vfio_group_put(struct vfio_group *group)
cdev_device_del(&group->cdev, &group->dev);
mutex_unlock(&vfio.group_lock);
- iommu_group_unregister_notifier(group->iommu_group, &group->nb);
put_device(&group->dev);
}
@@ -503,141 +491,6 @@ static struct vfio_device *vfio_group_get_device(struct vfio_group *group,
return NULL;
}
-/*
- * Some drivers, like pci-stub, are only used to prevent other drivers from
- * claiming a device and are therefore perfectly legitimate for a user owned
- * group. The pci-stub driver has no dependencies on DMA or the IOVA mapping
- * of the device, but it does prevent the user from having direct access to
- * the device, which is useful in some circumstances.
- *
- * We also assume that we can include PCI interconnect devices, ie. bridges.
- * IOMMU grouping on PCI necessitates that if we lack isolation on a bridge
- * then all of the downstream devices will be part of the same IOMMU group as
- * the bridge. Thus, if placing the bridge into the user owned IOVA space
- * breaks anything, it only does so for user owned devices downstream. Note
- * that error notification via MSI can be affected for platforms that handle
- * MSI within the same IOVA space as DMA.
- */
-static const char * const vfio_driver_allowed[] = { "pci-stub" };
-
-static bool vfio_dev_driver_allowed(struct device *dev,
- struct device_driver *drv)
-{
- if (dev_is_pci(dev)) {
- struct pci_dev *pdev = to_pci_dev(dev);
-
- if (pdev->hdr_type != PCI_HEADER_TYPE_NORMAL)
- return true;
- }
-
- return match_string(vfio_driver_allowed,
- ARRAY_SIZE(vfio_driver_allowed),
- drv->name) >= 0;
-}
-
-/*
- * A vfio group is viable for use by userspace if all devices are in
- * one of the following states:
- * - driver-less
- * - bound to a vfio driver
- * - bound to an otherwise allowed driver
- * - a PCI interconnect device
- *
- * We use two methods to determine whether a device is bound to a vfio
- * driver. The first is to test whether the device exists in the vfio
- * group. The second is to test if the device exists on the group
- * unbound_list, indicating it's in the middle of transitioning from
- * a vfio driver to driver-less.
- */
-static int vfio_dev_viable(struct device *dev, void *data)
-{
- struct vfio_group *group = data;
- struct vfio_device *device;
- struct device_driver *drv = READ_ONCE(dev->driver);
-
- if (!drv || vfio_dev_driver_allowed(dev, drv))
- return 0;
-
- device = vfio_group_get_device(group, dev);
- if (device) {
- vfio_device_put(device);
- return 0;
- }
-
- return -EINVAL;
-}
-
-/**
- * Async device support
- */
-static int vfio_group_nb_add_dev(struct vfio_group *group, struct device *dev)
-{
- struct vfio_device *device;
-
- /* Do we already know about it? We shouldn't */
- device = vfio_group_get_device(group, dev);
- if (WARN_ON_ONCE(device)) {
- vfio_device_put(device);
- return 0;
- }
-
- /* Nothing to do for idle groups */
- if (!atomic_read(&group->container_users))
- return 0;
-
- /* TODO Prevent device auto probing */
- dev_WARN(dev, "Device added to live group %d!\n",
- iommu_group_id(group->iommu_group));
-
- return 0;
-}
-
-static int vfio_group_nb_verify(struct vfio_group *group, struct device *dev)
-{
- /* We don't care what happens when the group isn't in use */
- if (!atomic_read(&group->container_users))
- return 0;
-
- return vfio_dev_viable(dev, group);
-}
-
-static int vfio_iommu_group_notifier(struct notifier_block *nb,
- unsigned long action, void *data)
-{
- struct vfio_group *group = container_of(nb, struct vfio_group, nb);
- struct device *dev = data;
-
- switch (action) {
- case IOMMU_GROUP_NOTIFY_ADD_DEVICE:
- vfio_group_nb_add_dev(group, dev);
- break;
- case IOMMU_GROUP_NOTIFY_DEL_DEVICE:
- /*
- * Nothing to do here. If the device is in use, then the
- * vfio sub-driver should block the remove callback until
- * it is unused. If the device is unused or attached to a
- * stub driver, then it should be released and we don't
- * care that it will be going away.
- */
- break;
- case IOMMU_GROUP_NOTIFY_BIND_DRIVER:
- dev_dbg(dev, "%s: group %d binding to driver\n", __func__,
- iommu_group_id(group->iommu_group));
- break;
- case IOMMU_GROUP_NOTIFY_BOUND_DRIVER:
- dev_dbg(dev, "%s: group %d bound to driver %s\n", __func__,
- iommu_group_id(group->iommu_group), dev->driver->name);
- BUG_ON(vfio_group_nb_verify(group, dev));
- break;
- case IOMMU_GROUP_NOTIFY_UNBIND_DRIVER:
- dev_dbg(dev, "%s: group %d unbinding from driver %s\n",
- __func__, iommu_group_id(group->iommu_group),
- dev->driver->name);
- break;
- }
- return NOTIFY_OK;
-}
-
/**
* VFIO driver API
*/
--
2.25.1
On Mon, Dec 06, 2021 at 09:58:48AM +0800, Lu Baolu wrote:
> The platform_dma_configure() is shared between platform and amba bus
> drivers. Rename the common helper to firmware_dma_configure() so that
> both platform and amba bus drivers could customize their dma_configure
> callbacks.
Please, if you are going to call these functions "firmware_" then move
them to the drivers/firmware/ location, they do not belong in
drivers/base/platform.c anymore, right?
thanks,
greg k-h
On Mon, Dec 06, 2021 at 09:58:49AM +0800, Lu Baolu wrote:
> Multiple platform devices may be placed in the same IOMMU group because
> they cannot be isolated from each other. These devices must either be
> entirely under kernel control or userspace control, never a mixture. This
> checks and sets DMA ownership during driver binding, and release the
> ownership during driver unbinding.
>
> Driver may set a new flag (suppress_auto_claim_dma_owner) to disable auto
> claiming DMA_OWNER_DMA_API ownership in the binding process. For instance,
> the userspace framework drivers (vfio etc.) which need to manually claim
> DMA_OWNER_PRIVATE_DOMAIN_USER when assigning a device to userspace.
>
> Signed-off-by: Lu Baolu <[email protected]>
> ---
> include/linux/platform_device.h | 1 +
> drivers/base/platform.c | 30 +++++++++++++++++++++++++++++-
> 2 files changed, 30 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
> index 4381c34af7e0..f3926be7582f 100644
> --- a/include/linux/platform_device.h
> +++ b/include/linux/platform_device.h
> @@ -210,6 +210,7 @@ struct platform_driver {
> struct device_driver driver;
> const struct platform_device_id *id_table;
> bool prevent_deferred_probe;
> + bool suppress_auto_claim_dma_owner;
We now have "prevent_" and "suppress_" as prefixes. Why the difference?
What is wrong with "prevent_" for your new flag?
thanks,
greg k-h
On Mon, Dec 06, 2021 at 09:59:03AM +0800, Lu Baolu wrote:
> @@ -941,48 +944,44 @@ int host1x_client_iommu_attach(struct host1x_client *client)
> * not the shared IOMMU domain, don't try to attach it to a different
> * domain. This allows using the IOMMU-backed DMA API.
> */
> - if (domain && domain != tegra->domain)
> + client->group = NULL;
> + if (!client->dev->iommu_group || (domain && domain != tegra->domain))
> + return iommu_device_set_dma_owner(client->dev,
> + DMA_OWNER_DMA_API, NULL);
> +
> + if (!tegra->domain)
> return 0;
This if should be removed completely now
Jason
On Mon, Dec 06, 2021 at 09:58:46AM +0800, Lu Baolu wrote:
> >From the perspective of who is initiating the device to do DMA, device
> DMA could be divided into the following types:
>
> DMA_OWNER_DMA_API: Device DMAs are initiated by a kernel driver
> through the kernel DMA API.
> DMA_OWNER_PRIVATE_DOMAIN: Device DMAs are initiated by a kernel
> driver with its own PRIVATE domain.
> DMA_OWNER_PRIVATE_DOMAIN_USER: Device DMAs are initiated by
> userspace.
I have looked at the other iommu patches in this series, but I still
don't quite get what the difference in the code flow is between
DMA_OWNER_PRIVATE_DOMAIN and DMA_OWNER_PRIVATE_DOMAIN_USER. What are the
differences in the iommu core behavior based on this setting?
> int iommu_device_set_dma_owner(struct device *dev,
> enum iommu_dma_owner type, void *owner_cookie);
> void iommu_device_release_dma_owner(struct device *dev,
> enum iommu_dma_owner type);
It the owner is a group-wide setting, it should be called with the group
instead of the device. I have seen the group-specific funcitons are
added later, but that leaves the question why the device-specific ones
are needed at all.
> + enum iommu_dma_owner dma_owner;
> + refcount_t owner_cnt;
> + void *owner_cookie;
> };
I am also not quite happy yet with calling this dma_owner, but can't
come up with a better name yet.
>
> struct group_device {
> @@ -621,6 +624,7 @@ struct iommu_group *iommu_group_alloc(void)
> INIT_LIST_HEAD(&group->devices);
> INIT_LIST_HEAD(&group->entry);
> BLOCKING_INIT_NOTIFIER_HEAD(&group->notifier);
> + group->dma_owner = DMA_OWNER_NONE;
DMA_OWNER_NONE is also questionable. All devices are always in one
domain, and the default domain is always the one used for DMA-API, so
why isn't the initial value DMA_OWNER_DMA_API?
Regards,
Joerg
On Mon, Dec 06, 2021 at 08:53:07AM +0100, Greg Kroah-Hartman wrote:
> On Mon, Dec 06, 2021 at 09:58:48AM +0800, Lu Baolu wrote:
> > The platform_dma_configure() is shared between platform and amba bus
> > drivers. Rename the common helper to firmware_dma_configure() so that
> > both platform and amba bus drivers could customize their dma_configure
> > callbacks.
>
> Please, if you are going to call these functions "firmware_" then move
> them to the drivers/firmware/ location, they do not belong in
> drivers/base/platform.c anymore, right?
firmware seems rather misnamed anyway, amba doesn't reall have anything
to do with "firmware".
On Mon, Dec 06, 2021 at 02:35:55PM +0100, Joerg Roedel wrote:
> > enum iommu_dma_owner type, void *owner_cookie);
> > void iommu_device_release_dma_owner(struct device *dev,
> > enum iommu_dma_owner type);
>
> It the owner is a group-wide setting, it should be called with the group
> instead of the device. I have seen the group-specific funcitons are
> added later, but that leaves the question why the device-specific ones
> are needed at all.
They aren't really. A lot of bus drivers need helpers to set/release
the dma API domain if there is an iommu group, but tegra which actually
sets a non-default value would be much better off with just open coding
them.
> > @@ -621,6 +624,7 @@ struct iommu_group *iommu_group_alloc(void)
> > INIT_LIST_HEAD(&group->devices);
> > INIT_LIST_HEAD(&group->entry);
> > BLOCKING_INIT_NOTIFIER_HEAD(&group->notifier);
> > + group->dma_owner = DMA_OWNER_NONE;
>
>
> DMA_OWNER_NONE is also questionable. All devices are always in one
> domain, and the default domain is always the one used for DMA-API, so
> why isn't the initial value DMA_OWNER_DMA_API?
The interesting part is the suppress_auto_claim_dma_owner flag, but it
might make more sense to release the dma API ownership for that rather
than requesting it if it is not set.
I really hate the amount of boilerplate code that having this in each
bus type causes.
Between that and the suggestion from Joerg I wonder if we could do the
following again:
- add new no_kernel_dma flag to struct device_driver
- set this flag for the various vfio drivers
- skip claiming the kernel dma ownership for those (or rather release
it if the suggestion from Joerg works out)
On Mon, Dec 06, 2021 at 09:58:46AM +0800, Lu Baolu wrote:
> >From the perspective of who is initiating the device to do DMA, device
> DMA could be divided into the following types:
>
> DMA_OWNER_DMA_API: Device DMAs are initiated by a kernel driver
> through the kernel DMA API.
> DMA_OWNER_PRIVATE_DOMAIN: Device DMAs are initiated by a kernel
> driver with its own PRIVATE domain.
> DMA_OWNER_PRIVATE_DOMAIN_USER: Device DMAs are initiated by
> userspace.
>
> Different DMA ownerships are exclusive for all devices in the same iommu
> group as an iommu group is the smallest granularity of device isolation
> and protection that the IOMMU subsystem can guarantee. This extends the
> iommu core to enforce this exclusion.
>
> Basically two new interfaces are provided:
>
> int iommu_device_set_dma_owner(struct device *dev,
> enum iommu_dma_owner type, void *owner_cookie);
> void iommu_device_release_dma_owner(struct device *dev,
> enum iommu_dma_owner type);
>
> Although above interfaces are per-device, DMA owner is tracked per group
> under the hood. An iommu group cannot have different dma ownership set
> at the same time. Violation of this assumption fails
> iommu_device_set_dma_owner().
>
> Kernel driver which does DMA have DMA_OWNER_DMA_API automatically set/
> released in the driver binding/unbinding process (see next patch).
>
> Kernel driver which doesn't do DMA could avoid setting the owner type.
> Device bound to such driver is considered same as a driver-less device
> which is compatible to all owner types.
>
> Userspace driver framework (e.g. vfio) should set
> DMA_OWNER_PRIVATE_DOMAIN_USER for a device before the userspace is allowed
> to access it, plus a owner cookie pointer to mark the user identity so a
> single group cannot be operated by multiple users simultaneously. Vice
> versa, the owner type should be released after the user access permission
> is withdrawn.
>
> Signed-off-by: Jason Gunthorpe <[email protected]>
> Signed-off-by: Kevin Tian <[email protected]>
> Signed-off-by: Lu Baolu <[email protected]>
> ---
> include/linux/iommu.h | 36 +++++++++++++++++
> drivers/iommu/iommu.c | 93 +++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 129 insertions(+)
>
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index d2f3435e7d17..24676b498f38 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -162,6 +162,23 @@ enum iommu_dev_features {
> IOMMU_DEV_FEAT_IOPF,
> };
>
> +/**
> + * enum iommu_dma_owner - IOMMU DMA ownership
> + * @DMA_OWNER_NONE: No DMA ownership.
> + * @DMA_OWNER_DMA_API: Device DMAs are initiated by a kernel driver through
> + * the kernel DMA API.
> + * @DMA_OWNER_PRIVATE_DOMAIN: Device DMAs are initiated by a kernel driver
> + * which provides an UNMANAGED domain.
> + * @DMA_OWNER_PRIVATE_DOMAIN_USER: Device DMAs are initiated by userspace,
> + * kernel ensures that DMAs never go to kernel memory.
> + */
> +enum iommu_dma_owner {
> + DMA_OWNER_NONE,
> + DMA_OWNER_DMA_API,
> + DMA_OWNER_PRIVATE_DOMAIN,
> + DMA_OWNER_PRIVATE_DOMAIN_USER,
> +};
> +
> #define IOMMU_PASID_INVALID (-1U)
>
> #ifdef CONFIG_IOMMU_API
> @@ -681,6 +698,10 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev,
> void iommu_sva_unbind_device(struct iommu_sva *handle);
> u32 iommu_sva_get_pasid(struct iommu_sva *handle);
>
> +int iommu_device_set_dma_owner(struct device *dev, enum iommu_dma_owner owner,
> + void *owner_cookie);
> +void iommu_device_release_dma_owner(struct device *dev, enum iommu_dma_owner owner);
> +
> #else /* CONFIG_IOMMU_API */
>
> struct iommu_ops {};
> @@ -1081,6 +1102,21 @@ static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev)
> {
> return NULL;
> }
> +
> +static inline int iommu_device_set_dma_owner(struct device *dev,
> + enum iommu_dma_owner owner,
> + void *owner_cookie)
> +{
> + if (owner != DMA_OWNER_DMA_API)
> + return -EINVAL;
> +
> + return 0;
> +}
> +
> +static inline void iommu_device_release_dma_owner(struct device *dev,
> + enum iommu_dma_owner owner)
> +{
> +}
> #endif /* CONFIG_IOMMU_API */
>
> /**
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 8b86406b7162..1de520a07518 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -48,6 +48,9 @@ struct iommu_group {
> struct iommu_domain *default_domain;
> struct iommu_domain *domain;
> struct list_head entry;
> + enum iommu_dma_owner dma_owner;
> + refcount_t owner_cnt;
owner_cnt is only manipulated under group->mutex, not need for a
refcount_t here, a plain unsigned int while do it and will also
simplify a fair bit of code as it avoid the need for atomic add/sub
and test operations.
> +static int __iommu_group_set_dma_owner(struct iommu_group *group,
> + enum iommu_dma_owner owner,
> + void *owner_cookie)
> +{
As pointed out last time, please move the group->mutex locking into
this helper, which makes it identical to the later added public
function.
> +static void __iommu_group_release_dma_owner(struct iommu_group *group,
> + enum iommu_dma_owner owner)
> +{
Same here.
On Mon, Dec 06, 2021 at 06:13:01AM -0800, Christoph Hellwig wrote:
> On Mon, Dec 06, 2021 at 08:53:07AM +0100, Greg Kroah-Hartman wrote:
> > On Mon, Dec 06, 2021 at 09:58:48AM +0800, Lu Baolu wrote:
> > > The platform_dma_configure() is shared between platform and amba bus
> > > drivers. Rename the common helper to firmware_dma_configure() so that
> > > both platform and amba bus drivers could customize their dma_configure
> > > callbacks.
> >
> > Please, if you are going to call these functions "firmware_" then move
> > them to the drivers/firmware/ location, they do not belong in
> > drivers/base/platform.c anymore, right?
>
> firmware seems rather misnamed anyway, amba doesn't reall have anything
> to do with "firmware".
Then the name is not a good one and should be called something else :)
The new helpers really could use a kerneldoc comment.
Also no need for the refcount_t with it's atomic operations for
attach_cnt either.
On Mon, Dec 06, 2021 at 06:13:01AM -0800, Christoph Hellwig wrote:
> On Mon, Dec 06, 2021 at 08:53:07AM +0100, Greg Kroah-Hartman wrote:
> > On Mon, Dec 06, 2021 at 09:58:48AM +0800, Lu Baolu wrote:
> > > The platform_dma_configure() is shared between platform and amba bus
> > > drivers. Rename the common helper to firmware_dma_configure() so that
> > > both platform and amba bus drivers could customize their dma_configure
> > > callbacks.
> >
> > Please, if you are going to call these functions "firmware_" then move
> > them to the drivers/firmware/ location, they do not belong in
> > drivers/base/platform.c anymore, right?
>
> firmware seems rather misnamed anyway, amba doesn't reall have anything
> to do with "firmware".
IIRC the only thing this function does is touch ACPI and OF stuff?
Isn't that firmware?
AFAICT amba uses this because AMBA devices might be linked to DT
descriptions?
Jason
On Mon, Dec 06, 2021 at 10:45:35AM -0400, Jason Gunthorpe via iommu wrote:
> IIRC the only thing this function does is touch ACPI and OF stuff?
> Isn't that firmware?
>
> AFAICT amba uses this because AMBA devices might be linked to DT
> descriptions?
But DT descriptions aren't firmware. They are usually either passed onb
the bootloader or in some deeply embedded setups embedded into the
kernel image.
On Mon, Dec 06, 2021 at 02:35:55PM +0100, Joerg Roedel wrote:
> On Mon, Dec 06, 2021 at 09:58:46AM +0800, Lu Baolu wrote:
> > >From the perspective of who is initiating the device to do DMA, device
> > DMA could be divided into the following types:
> >
> > DMA_OWNER_DMA_API: Device DMAs are initiated by a kernel driver
> > through the kernel DMA API.
> > DMA_OWNER_PRIVATE_DOMAIN: Device DMAs are initiated by a kernel
> > driver with its own PRIVATE domain.
> > DMA_OWNER_PRIVATE_DOMAIN_USER: Device DMAs are initiated by
> > userspace.
>
> I have looked at the other iommu patches in this series, but I still
> don't quite get what the difference in the code flow is between
> DMA_OWNER_PRIVATE_DOMAIN and DMA_OWNER_PRIVATE_DOMAIN_USER. What are the
> differences in the iommu core behavior based on this setting?
USER causes the IOMMU code to spend extra work to never assign the
default domain. Lu, it would be good to update the comment with this
detail
Once in USER mode the domain is always a /dev/null domain or a domain
controlled by userspace. Never a domain pointing at kernel memory.
> > int iommu_device_set_dma_owner(struct device *dev,
> > enum iommu_dma_owner type, void *owner_cookie);
> > void iommu_device_release_dma_owner(struct device *dev,
> > enum iommu_dma_owner type);
>
> It the owner is a group-wide setting, it should be called with the group
> instead of the device. I have seen the group-specific funcitons are
> added later, but that leaves the question why the device-specific ones
> are needed at all.
We should not be exposing group interfaces to drivers. Drivers are
device centric, they have struct devices, they should not be touching
the group. Figuring out how to relate a device to a group is the job
of the IOMMU code.
This series deletes the only use of the group interface from normal
drivers (tegra)
The device interfaces are the primary interface, the group interface
was added only to support VFIO and only because VFIO has made the
group part of it's uAPI.
> > struct group_device {
> > @@ -621,6 +624,7 @@ struct iommu_group *iommu_group_alloc(void)
> > INIT_LIST_HEAD(&group->devices);
> > INIT_LIST_HEAD(&group->entry);
> > BLOCKING_INIT_NOTIFIER_HEAD(&group->notifier);
> > + group->dma_owner = DMA_OWNER_NONE;
>
>
> DMA_OWNER_NONE is also questionable. All devices are always in one
> domain, and the default domain is always the one used for DMA-API, so
> why isn't the initial value DMA_OWNER_DMA_API?
'NONE' means the group is in the default domain but no driver is bound
and thus DMA isn't being used. Seeing NONE is the only condition when
it is OK to change the domain.
This could be reworked to instead rely on the refcount == 0 as the
signal to know it is OK to change the domain and then we never have
NONE at all. Lu?
Jason
On Mon, Dec 06, 2021 at 06:47:45AM -0800, Christoph Hellwig wrote:
> On Mon, Dec 06, 2021 at 10:45:35AM -0400, Jason Gunthorpe via iommu wrote:
> > IIRC the only thing this function does is touch ACPI and OF stuff?
> > Isn't that firmware?
> >
> > AFAICT amba uses this because AMBA devices might be linked to DT
> > descriptions?
>
> But DT descriptions aren't firmware. They are usually either passed onb
> the bootloader or in some deeply embedded setups embedded into the
> kernel image.
Pedenatically yes, but do you know of a common word to refer to both
OF and ACPI that is better than firmware? :)
AFAICT we already use firwmare for this in a few places, eg
fwnode_handle and so on.
Jason
On Mon, Dec 06, 2021 at 06:36:27AM -0800, Christoph Hellwig wrote:
> I really hate the amount of boilerplate code that having this in each
> bus type causes.
+1
I liked the first version of this series better with the code near
really_probe().
Can we go back to that with some device_configure_dma() wrapper
condtionally called by really_probe as we discussed?
> Between that and the suggestion from Joerg I wonder if we could do the
> following again:
>
> - add new no_kernel_dma flag to struct device_driver
> - set this flag for the various vfio drivers
> - skip claiming the kernel dma ownership for those (or rather release
> it if the suggestion from Joerg works out)
v1 did exactly this.
Jason
On 12/6/21 11:04 PM, Jason Gunthorpe wrote:
> On Mon, Dec 06, 2021 at 06:47:45AM -0800, Christoph Hellwig wrote:
>> On Mon, Dec 06, 2021 at 10:45:35AM -0400, Jason Gunthorpe via iommu wrote:
>>> IIRC the only thing this function does is touch ACPI and OF stuff?
>>> Isn't that firmware?
>>>
>>> AFAICT amba uses this because AMBA devices might be linked to DT
>>> descriptions?
>> But DT descriptions aren't firmware. They are usually either passed onb
>> the bootloader or in some deeply embedded setups embedded into the
>> kernel image.
> Pedenatically yes, but do you know of a common word to refer to both
> OF and ACPI that is better than firmware?:)
If the firmware_ name is confusing, how about common_dma_configure()?
Or, copy the 6 lines of code to amba bus driver?
Best regards,
baolu
On 12/6/21 11:01 PM, Jason Gunthorpe wrote:
> On Mon, Dec 06, 2021 at 02:35:55PM +0100, Joerg Roedel wrote:
>> On Mon, Dec 06, 2021 at 09:58:46AM +0800, Lu Baolu wrote:
>>> >From the perspective of who is initiating the device to do DMA, device
>>> DMA could be divided into the following types:
>>>
>>> DMA_OWNER_DMA_API: Device DMAs are initiated by a kernel driver
>>> through the kernel DMA API.
>>> DMA_OWNER_PRIVATE_DOMAIN: Device DMAs are initiated by a kernel
>>> driver with its own PRIVATE domain.
>>> DMA_OWNER_PRIVATE_DOMAIN_USER: Device DMAs are initiated by
>>> userspace.
>>
>> I have looked at the other iommu patches in this series, but I still
>> don't quite get what the difference in the code flow is between
>> DMA_OWNER_PRIVATE_DOMAIN and DMA_OWNER_PRIVATE_DOMAIN_USER. What are the
>> differences in the iommu core behavior based on this setting?
>
> USER causes the IOMMU code to spend extra work to never assign the
> default domain. Lu, it would be good to update the comment with this
> detail
>
> Once in USER mode the domain is always a /dev/null domain or a domain
> controlled by userspace. Never a domain pointing at kernel memory.
Yes. The __iommu_detach_group() re-attaches the default domain
automatically. This is not allowed once in USER mode.
I will update the comments whit this detail.
>
>>> struct group_device {
>>> @@ -621,6 +624,7 @@ struct iommu_group *iommu_group_alloc(void)
>>> INIT_LIST_HEAD(&group->devices);
>>> INIT_LIST_HEAD(&group->entry);
>>> BLOCKING_INIT_NOTIFIER_HEAD(&group->notifier);
>>> + group->dma_owner = DMA_OWNER_NONE;
>>
>>
>> DMA_OWNER_NONE is also questionable. All devices are always in one
>> domain, and the default domain is always the one used for DMA-API, so
>> why isn't the initial value DMA_OWNER_DMA_API?
>
> 'NONE' means the group is in the default domain but no driver is bound
> and thus DMA isn't being used. Seeing NONE is the only condition when
> it is OK to change the domain.
>
> This could be reworked to instead rely on the refcount == 0 as the
> signal to know it is OK to change the domain and then we never have
> NONE at all. Lu?
NONE is just a parking state. It's okay to rely on the "refcount == 0"
for state transition as far as I see. I will work towards this.
Best regards,
baolu
On 12/6/21 10:42 PM, Christoph Hellwig wrote:
> On Mon, Dec 06, 2021 at 09:58:46AM +0800, Lu Baolu wrote:
>> >From the perspective of who is initiating the device to do DMA, device
>> DMA could be divided into the following types:
>>
>> DMA_OWNER_DMA_API: Device DMAs are initiated by a kernel driver
>> through the kernel DMA API.
>> DMA_OWNER_PRIVATE_DOMAIN: Device DMAs are initiated by a kernel
>> driver with its own PRIVATE domain.
>> DMA_OWNER_PRIVATE_DOMAIN_USER: Device DMAs are initiated by
>> userspace.
>>
>> Different DMA ownerships are exclusive for all devices in the same iommu
>> group as an iommu group is the smallest granularity of device isolation
>> and protection that the IOMMU subsystem can guarantee. This extends the
>> iommu core to enforce this exclusion.
>>
>> Basically two new interfaces are provided:
>>
>> int iommu_device_set_dma_owner(struct device *dev,
>> enum iommu_dma_owner type, void *owner_cookie);
>> void iommu_device_release_dma_owner(struct device *dev,
>> enum iommu_dma_owner type);
>>
>> Although above interfaces are per-device, DMA owner is tracked per group
>> under the hood. An iommu group cannot have different dma ownership set
>> at the same time. Violation of this assumption fails
>> iommu_device_set_dma_owner().
>>
>> Kernel driver which does DMA have DMA_OWNER_DMA_API automatically set/
>> released in the driver binding/unbinding process (see next patch).
>>
>> Kernel driver which doesn't do DMA could avoid setting the owner type.
>> Device bound to such driver is considered same as a driver-less device
>> which is compatible to all owner types.
>>
>> Userspace driver framework (e.g. vfio) should set
>> DMA_OWNER_PRIVATE_DOMAIN_USER for a device before the userspace is allowed
>> to access it, plus a owner cookie pointer to mark the user identity so a
>> single group cannot be operated by multiple users simultaneously. Vice
>> versa, the owner type should be released after the user access permission
>> is withdrawn.
>>
>> Signed-off-by: Jason Gunthorpe <[email protected]>
>> Signed-off-by: Kevin Tian <[email protected]>
>> Signed-off-by: Lu Baolu <[email protected]>
>> ---
>> include/linux/iommu.h | 36 +++++++++++++++++
>> drivers/iommu/iommu.c | 93 +++++++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 129 insertions(+)
>>
>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>> index d2f3435e7d17..24676b498f38 100644
>> --- a/include/linux/iommu.h
>> +++ b/include/linux/iommu.h
>> @@ -162,6 +162,23 @@ enum iommu_dev_features {
>> IOMMU_DEV_FEAT_IOPF,
>> };
>>
>> +/**
>> + * enum iommu_dma_owner - IOMMU DMA ownership
>> + * @DMA_OWNER_NONE: No DMA ownership.
>> + * @DMA_OWNER_DMA_API: Device DMAs are initiated by a kernel driver through
>> + * the kernel DMA API.
>> + * @DMA_OWNER_PRIVATE_DOMAIN: Device DMAs are initiated by a kernel driver
>> + * which provides an UNMANAGED domain.
>> + * @DMA_OWNER_PRIVATE_DOMAIN_USER: Device DMAs are initiated by userspace,
>> + * kernel ensures that DMAs never go to kernel memory.
>> + */
>> +enum iommu_dma_owner {
>> + DMA_OWNER_NONE,
>> + DMA_OWNER_DMA_API,
>> + DMA_OWNER_PRIVATE_DOMAIN,
>> + DMA_OWNER_PRIVATE_DOMAIN_USER,
>> +};
>> +
>> #define IOMMU_PASID_INVALID (-1U)
>>
>> #ifdef CONFIG_IOMMU_API
>> @@ -681,6 +698,10 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev,
>> void iommu_sva_unbind_device(struct iommu_sva *handle);
>> u32 iommu_sva_get_pasid(struct iommu_sva *handle);
>>
>> +int iommu_device_set_dma_owner(struct device *dev, enum iommu_dma_owner owner,
>> + void *owner_cookie);
>> +void iommu_device_release_dma_owner(struct device *dev, enum iommu_dma_owner owner);
>> +
>> #else /* CONFIG_IOMMU_API */
>>
>> struct iommu_ops {};
>> @@ -1081,6 +1102,21 @@ static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev)
>> {
>> return NULL;
>> }
>> +
>> +static inline int iommu_device_set_dma_owner(struct device *dev,
>> + enum iommu_dma_owner owner,
>> + void *owner_cookie)
>> +{
>> + if (owner != DMA_OWNER_DMA_API)
>> + return -EINVAL;
>> +
>> + return 0;
>> +}
>> +
>> +static inline void iommu_device_release_dma_owner(struct device *dev,
>> + enum iommu_dma_owner owner)
>> +{
>> +}
>> #endif /* CONFIG_IOMMU_API */
>>
>> /**
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index 8b86406b7162..1de520a07518 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -48,6 +48,9 @@ struct iommu_group {
>> struct iommu_domain *default_domain;
>> struct iommu_domain *domain;
>> struct list_head entry;
>> + enum iommu_dma_owner dma_owner;
>> + refcount_t owner_cnt;
>
> owner_cnt is only manipulated under group->mutex, not need for a
> refcount_t here, a plain unsigned int while do it and will also
> simplify a fair bit of code as it avoid the need for atomic add/sub
> and test operations.
Fair enough.
>
>> +static int __iommu_group_set_dma_owner(struct iommu_group *group,
>> + enum iommu_dma_owner owner,
>> + void *owner_cookie)
>> +{
>
> As pointed out last time, please move the group->mutex locking into
> this helper, which makes it identical to the later added public
> function.
I didn't mean to ignore your comment. :-) As I replied, by placing the
lock out of the function, the helper could easily handle the error paths
(return directly without something like "goto out_unlock").
As the implementation of iommu_group_set_dma_owner() has been greatly
simplified, I agree with you now, we should move the group->mutex
locking into the helper and make it identical to the latter public
interface.
I will work towards this.
>
>> +static void __iommu_group_release_dma_owner(struct iommu_group *group,
>> + enum iommu_dma_owner owner)
>> +{
>
> Same here.
>
Ditto.
Best regards,
baolu
On 12/6/21 10:43 PM, Christoph Hellwig wrote:
> The new helpers really could use a kerneldoc comment.
> Also no need for the refcount_t with it's atomic operations for
> attach_cnt either.
>
Yes. Agreed.
Best regards,
baolu
On 12/6/21 8:40 PM, Jason Gunthorpe wrote:
> On Mon, Dec 06, 2021 at 09:59:03AM +0800, Lu Baolu wrote:
>
>> @@ -941,48 +944,44 @@ int host1x_client_iommu_attach(struct host1x_client *client)
>> * not the shared IOMMU domain, don't try to attach it to a different
>> * domain. This allows using the IOMMU-backed DMA API.
>> */
>> - if (domain && domain != tegra->domain)
>> + client->group = NULL;
>> + if (!client->dev->iommu_group || (domain && domain != tegra->domain))
>> + return iommu_device_set_dma_owner(client->dev,
>> + DMA_OWNER_DMA_API, NULL);
>> +
>> + if (!tegra->domain)
>> return 0;
>
> This if should be removed completely now
>
> Jason
>
Sure.
Best regards,
baolu
On 12/6/21 11:06 PM, Jason Gunthorpe wrote:
> On Mon, Dec 06, 2021 at 06:36:27AM -0800, Christoph Hellwig wrote:
>> I really hate the amount of boilerplate code that having this in each
>> bus type causes.
> +1
>
> I liked the first version of this series better with the code near
> really_probe().
>
> Can we go back to that with some device_configure_dma() wrapper
> condtionally called by really_probe as we discussed?
>
Are you talking about below change?
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 68ea1f949daa..368f9e530515 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -577,7 +577,13 @@ static int really_probe(struct device *dev, struct
device_driver *drv)
if (dev->bus->dma_configure) {
ret = dev->bus->dma_configure(dev);
if (ret)
- goto probe_failed;
+ goto pinctrl_bind_failed;
+
+ if (!drv->no_kernel_dma) {
+ ret = iommu_device_set_dma_owner(dev, DMA_OWNER_DMA_API, NULL);
+ if (ret)
+ goto pinctrl_bind_failed;
+ }
}
ret = driver_sysfs_add(dev);
@@ -660,6 +666,9 @@ static int really_probe(struct device *dev, struct
device_driver *drv)
if (dev->bus)
blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
BUS_NOTIFY_DRIVER_NOT_BOUND, dev);
+
+ if (dev->bus->dma_configure && !drv->no_kernel_dma)
+ iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API);
pinctrl_bind_failed:
device_links_no_driver(dev);
devres_release_all(dev);
@@ -1204,6 +1213,9 @@ static void __device_release_driver(struct device
*dev, struct device *parent)
else if (drv->remove)
drv->remove(dev);
+ if (dev->bus->dma_configure && !drv->no_kernel_dma)
+ iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API);
+
device_links_driver_cleanup(dev);
devres_release_all(dev);
diff --git a/include/linux/device/driver.h b/include/linux/device/driver.h
index a498ebcf4993..2cf7b757b28e 100644
--- a/include/linux/device/driver.h
+++ b/include/linux/device/driver.h
@@ -100,6 +100,7 @@ struct device_driver {
const char *mod_name; /* used for built-in modules */
bool suppress_bind_attrs; /* disables bind/unbind via sysfs */
+ bool no_kernel_dma;
enum probe_type probe_type;
const struct of_device_id *of_match_table;
Best regards,
baolu
On Tue, Dec 07, 2021 at 10:57:25AM +0800, Lu Baolu wrote:
> On 12/6/21 11:06 PM, Jason Gunthorpe wrote:
> > On Mon, Dec 06, 2021 at 06:36:27AM -0800, Christoph Hellwig wrote:
> > > I really hate the amount of boilerplate code that having this in each
> > > bus type causes.
> > +1
> >
> > I liked the first version of this series better with the code near
> > really_probe().
> >
> > Can we go back to that with some device_configure_dma() wrapper
> > condtionally called by really_probe as we discussed?
> >
>
> Are you talking about below change?
>
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 68ea1f949daa..368f9e530515 100644
> +++ b/drivers/base/dd.c
> @@ -577,7 +577,13 @@ static int really_probe(struct device *dev, struct
> device_driver *drv)
> if (dev->bus->dma_configure) {
> ret = dev->bus->dma_configure(dev);
> if (ret)
> - goto probe_failed;
> + goto pinctrl_bind_failed;
> +
> + if (!drv->no_kernel_dma) {
> + ret = iommu_device_set_dma_owner(dev, DMA_OWNER_DMA_API, NULL);
> + if (ret)
> + goto pinctrl_bind_failed;
> + }
> }
Yes, the suggestion was to put everything that 'if' inside a function
and then of course a matching undo function.
Jason
On Tue, Dec 07, 2021 at 09:16:27AM -0400, Jason Gunthorpe wrote:
> Yes, the suggestion was to put everything that 'if' inside a function
> and then of course a matching undo function.
Can't we simplify things even more? Do away with the DMA API owner
entirely, and instead in iommu_group_set_dma_owner iterate over all
devices in a group and check that they all have the no_dma_api flag
set (plus a similar check on group join). With that most of the
boilerplate code goes away entirely in favor of a little more work at
iommu_group_set_dma_owner time.
On Tue, Dec 07, 2021 at 05:25:04AM -0800, Christoph Hellwig wrote:
> On Tue, Dec 07, 2021 at 09:16:27AM -0400, Jason Gunthorpe wrote:
> > Yes, the suggestion was to put everything that 'if' inside a function
> > and then of course a matching undo function.
>
> Can't we simplify things even more? Do away with the DMA API owner
> entirely, and instead in iommu_group_set_dma_owner iterate over all
> devices in a group and check that they all have the no_dma_api flag
> set (plus a similar check on group join). With that most of the
> boilerplate code goes away entirely in favor of a little more work at
> iommu_group_set_dma_owner time.
Robin suggested something like this already.
The locking doesn't work out, we can't nest device_lock()'s safely
without ABBA deadlocks, and can't touch the dev->driver without the
device_lock.
Jason
On Mon, Dec 6, 2021 at 7:04 AM Jason Gunthorpe <[email protected]> wrote:
>
> On Mon, Dec 06, 2021 at 06:47:45AM -0800, Christoph Hellwig wrote:
> > On Mon, Dec 06, 2021 at 10:45:35AM -0400, Jason Gunthorpe via iommu wrote:
> > > IIRC the only thing this function does is touch ACPI and OF stuff?
> > > Isn't that firmware?
> > >
> > > AFAICT amba uses this because AMBA devices might be linked to DT
> > > descriptions?
> >
> > But DT descriptions aren't firmware. They are usually either passed onb
> > the bootloader or in some deeply embedded setups embedded into the
> > kernel image.
>
> Pedenatically yes, but do you know of a common word to refer to both
> OF and ACPI that is better than firmware? :)
>
> AFAICT we already use firwmare for this in a few places, eg
> fwnode_handle and so on.
I've always thought 'platform' was the generic name for otherwise
non-enumerable platform-firmware/data things enumerated by ACPI / OF.
On 12/7/21 9:16 PM, Jason Gunthorpe wrote:
> On Tue, Dec 07, 2021 at 10:57:25AM +0800, Lu Baolu wrote:
>> On 12/6/21 11:06 PM, Jason Gunthorpe wrote:
>>> On Mon, Dec 06, 2021 at 06:36:27AM -0800, Christoph Hellwig wrote:
>>>> I really hate the amount of boilerplate code that having this in each
>>>> bus type causes.
>>> +1
>>>
>>> I liked the first version of this series better with the code near
>>> really_probe().
>>>
>>> Can we go back to that with some device_configure_dma() wrapper
>>> condtionally called by really_probe as we discussed?
>>>
>>
>> Are you talking about below change?
>>
>> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
>> index 68ea1f949daa..368f9e530515 100644
>> +++ b/drivers/base/dd.c
>> @@ -577,7 +577,13 @@ static int really_probe(struct device *dev, struct
>> device_driver *drv)
>> if (dev->bus->dma_configure) {
>> ret = dev->bus->dma_configure(dev);
>> if (ret)
>> - goto probe_failed;
>> + goto pinctrl_bind_failed;
>> +
>> + if (!drv->no_kernel_dma) {
>> + ret = iommu_device_set_dma_owner(dev, DMA_OWNER_DMA_API, NULL);
>> + if (ret)
>> + goto pinctrl_bind_failed;
>> + }
>> }
>
> Yes, the suggestion was to put everything that 'if' inside a function
> and then of course a matching undo function.
Followed your suggestion, I refactored the change like below:
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 68ea1f949daa..68ca5a579eb1 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -538,6 +538,32 @@ static int call_driver_probe(struct device *dev,
struct device_driver *drv)
return ret;
}
+static int device_dma_configure(struct device *dev, struct
device_driver *drv)
+{
+ int ret;
+
+ if (!dev->bus->dma_configure)
+ return 0;
+
+ ret = dev->bus->dma_configure(dev);
+ if (ret)
+ return ret;
+
+ if (!drv->suppress_auto_claim_dma_owner)
+ ret = iommu_device_set_dma_owner(dev, DMA_OWNER_DMA_API,
NULL);
+
+ return ret;
+}
+
+static void device_dma_cleanup(struct device *dev, struct device_driver
*drv)
+{
+ if (!dev->bus->dma_configure)
+ return;
+
+ if (!drv->suppress_auto_claim_dma_owner)
+ iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API,
NULL);
+}
+
static int really_probe(struct device *dev, struct device_driver *drv)
{
bool test_remove = IS_ENABLED(CONFIG_DEBUG_TEST_DRIVER_REMOVE) &&
@@ -574,11 +600,8 @@ static int really_probe(struct device *dev, struct
device_driver *drv)
if (ret)
goto pinctrl_bind_failed;
- if (dev->bus->dma_configure) {
- ret = dev->bus->dma_configure(dev);
- if (ret)
- goto probe_failed;
- }
+ if (device_dma_configure(dev, drv))
+ goto pinctrl_bind_failed;
ret = driver_sysfs_add(dev);
if (ret) {
@@ -660,6 +683,8 @@ static int really_probe(struct device *dev, struct
device_driver *drv)
if (dev->bus)
blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
BUS_NOTIFY_DRIVER_NOT_BOUND, dev);
+
+ device_dma_cleanup(dev, drv);
pinctrl_bind_failed:
device_links_no_driver(dev);
devres_release_all(dev);
@@ -1204,6 +1229,7 @@ static void __device_release_driver(struct device
*dev, struct device *parent)
else if (drv->remove)
drv->remove(dev);
+ device_dma_cleanup(dev, drv);
device_links_driver_cleanup(dev);
devres_release_all(dev);
diff --git a/include/linux/device/driver.h b/include/linux/device/driver.h
index a498ebcf4993..374a3c2cc10d 100644
--- a/include/linux/device/driver.h
+++ b/include/linux/device/driver.h
@@ -100,6 +100,7 @@ struct device_driver {
const char *mod_name; /* used for built-in
modules */
bool suppress_bind_attrs; /* disables bind/unbind via
sysfs */
+ bool suppress_auto_claim_dma_owner;
enum probe_type probe_type;
const struct of_device_id *of_match_table;
Further suggestions?
Best regards,
baolu
Hi Greg, Jason and Christoph,
On 12/9/21 9:20 AM, Lu Baolu wrote:
> On 12/7/21 9:16 PM, Jason Gunthorpe wrote:
>> On Tue, Dec 07, 2021 at 10:57:25AM +0800, Lu Baolu wrote:
>>> On 12/6/21 11:06 PM, Jason Gunthorpe wrote:
>>>> On Mon, Dec 06, 2021 at 06:36:27AM -0800, Christoph Hellwig wrote:
>>>>> I really hate the amount of boilerplate code that having this in each
>>>>> bus type causes.
>>>> +1
>>>>
>>>> I liked the first version of this series better with the code near
>>>> really_probe().
>>>>
>>>> Can we go back to that with some device_configure_dma() wrapper
>>>> condtionally called by really_probe as we discussed?
[...]
>
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 68ea1f949daa..68ca5a579eb1 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -538,6 +538,32 @@ static int call_driver_probe(struct device *dev,
> struct device_driver *drv)
> return ret;
> }
>
> +static int device_dma_configure(struct device *dev, struct
> device_driver *drv)
> +{
> + int ret;
> +
> + if (!dev->bus->dma_configure)
> + return 0;
> +
> + ret = dev->bus->dma_configure(dev);
> + if (ret)
> + return ret;
> +
> + if (!drv->suppress_auto_claim_dma_owner)
> + ret = iommu_device_set_dma_owner(dev, DMA_OWNER_DMA_API,
> NULL);
> +
> + return ret;
> +}
> +
> +static void device_dma_cleanup(struct device *dev, struct device_driver
> *drv)
> +{
> + if (!dev->bus->dma_configure)
> + return;
> +
> + if (!drv->suppress_auto_claim_dma_owner)
> + iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API,
> NULL);
> +}
> +
> static int really_probe(struct device *dev, struct device_driver *drv)
> {
> bool test_remove = IS_ENABLED(CONFIG_DEBUG_TEST_DRIVER_REMOVE) &&
> @@ -574,11 +600,8 @@ static int really_probe(struct device *dev, struct
> device_driver *drv)
> if (ret)
> goto pinctrl_bind_failed;
>
> - if (dev->bus->dma_configure) {
> - ret = dev->bus->dma_configure(dev);
> - if (ret)
> - goto probe_failed;
> - }
> + if (device_dma_configure(dev, drv))
> + goto pinctrl_bind_failed;
>
> ret = driver_sysfs_add(dev);
> if (ret) {
> @@ -660,6 +683,8 @@ static int really_probe(struct device *dev, struct
> device_driver *drv)
> if (dev->bus)
> blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
>
> BUS_NOTIFY_DRIVER_NOT_BOUND, dev);
> +
> + device_dma_cleanup(dev, drv);
> pinctrl_bind_failed:
> device_links_no_driver(dev);
> devres_release_all(dev);
> @@ -1204,6 +1229,7 @@ static void __device_release_driver(struct device
> *dev, struct device *parent)
> else if (drv->remove)
> drv->remove(dev);
>
> + device_dma_cleanup(dev, drv);
> device_links_driver_cleanup(dev);
>
> devres_release_all(dev);
> diff --git a/include/linux/device/driver.h b/include/linux/device/driver.h
> index a498ebcf4993..374a3c2cc10d 100644
> --- a/include/linux/device/driver.h
> +++ b/include/linux/device/driver.h
> @@ -100,6 +100,7 @@ struct device_driver {
> const char *mod_name; /* used for built-in
> modules */
>
> bool suppress_bind_attrs; /* disables bind/unbind via
> sysfs */
> + bool suppress_auto_claim_dma_owner;
> enum probe_type probe_type;
>
> const struct of_device_id *of_match_table;
Does this work for you? Can I work towards this in the next version?
Best regards,
baolu
On 12/10/21 9:23 AM, Lu Baolu wrote:
> Hi Greg, Jason and Christoph,
>
> On 12/9/21 9:20 AM, Lu Baolu wrote:
>> On 12/7/21 9:16 PM, Jason Gunthorpe wrote:
>>> On Tue, Dec 07, 2021 at 10:57:25AM +0800, Lu Baolu wrote:
>>>> On 12/6/21 11:06 PM, Jason Gunthorpe wrote:
>>>>> On Mon, Dec 06, 2021 at 06:36:27AM -0800, Christoph Hellwig wrote:
>>>>>> I really hate the amount of boilerplate code that having this in each
>>>>>> bus type causes.
>>>>> +1
>>>>>
>>>>> I liked the first version of this series better with the code near
>>>>> really_probe().
>>>>>
>>>>> Can we go back to that with some device_configure_dma() wrapper
>>>>> condtionally called by really_probe as we discussed?
>
> [...]
>
>>
>> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
>> index 68ea1f949daa..68ca5a579eb1 100644
>> --- a/drivers/base/dd.c
>> +++ b/drivers/base/dd.c
>> @@ -538,6 +538,32 @@ static int call_driver_probe(struct device *dev,
>> struct device_driver *drv)
>> return ret;
>> }
>>
>> +static int device_dma_configure(struct device *dev, struct
>> device_driver *drv)
>> +{
>> + int ret;
>> +
>> + if (!dev->bus->dma_configure)
>> + return 0;
>> +
>> + ret = dev->bus->dma_configure(dev);
>> + if (ret)
>> + return ret;
>> +
>> + if (!drv->suppress_auto_claim_dma_owner)
>> + ret = iommu_device_set_dma_owner(dev,
>> DMA_OWNER_DMA_API, NULL);
>> +
>> + return ret;
>> +}
>> +
>> +static void device_dma_cleanup(struct device *dev, struct
>> device_driver *drv)
>> +{
>> + if (!dev->bus->dma_configure)
>> + return;
>> +
>> + if (!drv->suppress_auto_claim_dma_owner)
>> + iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API);
>> +}
>> +
>> static int really_probe(struct device *dev, struct device_driver *drv)
>> {
>> bool test_remove =
>> IS_ENABLED(CONFIG_DEBUG_TEST_DRIVER_REMOVE) &&
>> @@ -574,11 +600,8 @@ static int really_probe(struct device *dev,
>> struct device_driver *drv)
>> if (ret)
>> goto pinctrl_bind_failed;
>>
>> - if (dev->bus->dma_configure) {
>> - ret = dev->bus->dma_configure(dev);
>> - if (ret)
>> - goto probe_failed;
>> - }
>> + if (device_dma_configure(dev, drv))
>> + goto pinctrl_bind_failed;
>>
>> ret = driver_sysfs_add(dev);
>> if (ret) {
>> @@ -660,6 +683,8 @@ static int really_probe(struct device *dev, struct
>> device_driver *drv)
>> if (dev->bus)
>> blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
>>
>> BUS_NOTIFY_DRIVER_NOT_BOUND, dev);
>> +
>> + device_dma_cleanup(dev, drv);
>> pinctrl_bind_failed:
>> device_links_no_driver(dev);
>> devres_release_all(dev);
>> @@ -1204,6 +1229,7 @@ static void __device_release_driver(struct
>> device *dev, struct device *parent)
>> else if (drv->remove)
>> drv->remove(dev);
>>
>> + device_dma_cleanup(dev, drv);
>> device_links_driver_cleanup(dev);
>>
>> devres_release_all(dev);
>> diff --git a/include/linux/device/driver.h
>> b/include/linux/device/driver.h
>> index a498ebcf4993..374a3c2cc10d 100644
>> --- a/include/linux/device/driver.h
>> +++ b/include/linux/device/driver.h
>> @@ -100,6 +100,7 @@ struct device_driver {
>> const char *mod_name; /* used for built-in
>> modules */
>>
>> bool suppress_bind_attrs; /* disables bind/unbind via
>> sysfs */
>> + bool suppress_auto_claim_dma_owner;
>> enum probe_type probe_type;
>>
>> const struct of_device_id *of_match_table;
>
> Does this work for you? Can I work towards this in the next version?
A kindly ping ... Is this heading the right direction? I need your
advice to move ahead. :-)
Best regards,
baolu
On Mon, Dec 13, 2021 at 08:50:05AM +0800, Lu Baolu wrote:
> > Does this work for you? Can I work towards this in the next version?
>
> A kindly ping ... Is this heading the right direction? I need your
> advice to move ahead. :-)
I prefer this to all the duplicated code in the v3 series.. Given it
touches almost every bus type that implements the dma_configure it
seems appropriate.
Jason
This approach looks much better to me.
Hi Greg,
On 2021/12/13 8:50, Lu Baolu wrote:
> On 12/10/21 9:23 AM, Lu Baolu wrote:
>> Hi Greg, Jason and Christoph,
>>
>> On 12/9/21 9:20 AM, Lu Baolu wrote:
>>> On 12/7/21 9:16 PM, Jason Gunthorpe wrote:
>>>> On Tue, Dec 07, 2021 at 10:57:25AM +0800, Lu Baolu wrote:
>>>>> On 12/6/21 11:06 PM, Jason Gunthorpe wrote:
>>>>>> On Mon, Dec 06, 2021 at 06:36:27AM -0800, Christoph Hellwig wrote:
>>>>>>> I really hate the amount of boilerplate code that having this in
>>>>>>> each
>>>>>>> bus type causes.
>>>>>> +1
>>>>>>
>>>>>> I liked the first version of this series better with the code near
>>>>>> really_probe().
>>>>>>
>>>>>> Can we go back to that with some device_configure_dma() wrapper
>>>>>> condtionally called by really_probe as we discussed?
>>
>> [...]
>>
>>>
>>> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
>>> index 68ea1f949daa..68ca5a579eb1 100644
>>> --- a/drivers/base/dd.c
>>> +++ b/drivers/base/dd.c
>>> @@ -538,6 +538,32 @@ static int call_driver_probe(struct device *dev,
>>> struct device_driver *drv)
>>> return ret;
>>> }
>>>
>>> +static int device_dma_configure(struct device *dev, struct
>>> device_driver *drv)
>>> +{
>>> + int ret;
>>> +
>>> + if (!dev->bus->dma_configure)
>>> + return 0;
>>> +
>>> + ret = dev->bus->dma_configure(dev);
>>> + if (ret)
>>> + return ret;
>>> +
>>> + if (!drv->suppress_auto_claim_dma_owner)
>>> + ret = iommu_device_set_dma_owner(dev,
>>> DMA_OWNER_DMA_API, NULL);
>>> +
>>> + return ret;
>>> +}
>>> +
>>> +static void device_dma_cleanup(struct device *dev, struct
>>> device_driver *drv)
>>> +{
>>> + if (!dev->bus->dma_configure)
>>> + return;
>>> +
>>> + if (!drv->suppress_auto_claim_dma_owner)
>>> + iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API);
>>> +}
>>> +
>>> static int really_probe(struct device *dev, struct device_driver *drv)
>>> {
>>> bool test_remove =
>>> IS_ENABLED(CONFIG_DEBUG_TEST_DRIVER_REMOVE) &&
>>> @@ -574,11 +600,8 @@ static int really_probe(struct device *dev,
>>> struct device_driver *drv)
>>> if (ret)
>>> goto pinctrl_bind_failed;
>>>
>>> - if (dev->bus->dma_configure) {
>>> - ret = dev->bus->dma_configure(dev);
>>> - if (ret)
>>> - goto probe_failed;
>>> - }
>>> + if (device_dma_configure(dev, drv))
>>> + goto pinctrl_bind_failed;
>>>
>>> ret = driver_sysfs_add(dev);
>>> if (ret) {
>>> @@ -660,6 +683,8 @@ static int really_probe(struct device *dev,
>>> struct device_driver *drv)
>>> if (dev->bus)
>>>
>>> blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
>>>
>>> BUS_NOTIFY_DRIVER_NOT_BOUND, dev);
>>> +
>>> + device_dma_cleanup(dev, drv);
>>> pinctrl_bind_failed:
>>> device_links_no_driver(dev);
>>> devres_release_all(dev);
>>> @@ -1204,6 +1229,7 @@ static void __device_release_driver(struct
>>> device *dev, struct device *parent)
>>> else if (drv->remove)
>>> drv->remove(dev);
>>>
>>> + device_dma_cleanup(dev, drv);
>>> device_links_driver_cleanup(dev);
>>>
>>> devres_release_all(dev);
>>> diff --git a/include/linux/device/driver.h
>>> b/include/linux/device/driver.h
>>> index a498ebcf4993..374a3c2cc10d 100644
>>> --- a/include/linux/device/driver.h
>>> +++ b/include/linux/device/driver.h
>>> @@ -100,6 +100,7 @@ struct device_driver {
>>> const char *mod_name; /* used for built-in
>>> modules */
>>>
>>> bool suppress_bind_attrs; /* disables bind/unbind via
>>> sysfs */
>>> + bool suppress_auto_claim_dma_owner;
>>> enum probe_type probe_type;
>>>
>>> const struct of_device_id *of_match_table;
>>
>> Does this work for you? Can I work towards this in the next version?
>
> A kindly ping ... Is this heading the right direction? I need your
> advice to move ahead. :-)
Can I do it like this? :-)
Best regards,
baolu
On 12/6/21 9:58 AM, Lu Baolu wrote:
> Hi folks,
>
> The iommu group is the minimal isolation boundary for DMA. Devices in
> a group can access each other's MMIO registers via peer to peer DMA
> and also need share the same I/O address space.
>
> Once the I/O address space is assigned to user control it is no longer
> available to the dma_map* API, which effectively makes the DMA API
> non-working.
>
> Second, userspace can use DMA initiated by a device that it controls
> to access the MMIO spaces of other devices in the group. This allows
> userspace to indirectly attack any kernel owned device and it's driver.
>
> Therefore groups must either be entirely under kernel control or
> userspace control, never a mixture. Unfortunately some systems have
> problems with the granularity of groups and there are a couple of
> important exceptions:
>
> - pci_stub allows the admin to block driver binding on a device and
> make it permanently shared with userspace. Since PCI stub does not
> do DMA it is safe, however the admin must understand that using
> pci_stub allows userspace to attack whatever device it was bound
> it.
>
> - PCI bridges are sometimes included in groups. Typically PCI bridges
> do not use DMA, and generally do not have MMIO regions.
>
> Generally any device that does not have any MMIO registers is a
> possible candidate for an exception.
>
> Currently vfio adopts a workaround to detect violations of the above
> restrictions by monitoring the driver core BOUND event, and hardwiring
> the above exceptions. Since there is no way for vfio to reject driver
> binding at this point, BUG_ON() is triggered if a violation is
> captured (kernel driver BOUND event on a group which already has some
> devices assigned to userspace). Aside from the bad user experience
> this opens a way for root userspace to crash the kernel, even in high
> integrity configurations, by manipulating the module binding and
> triggering the BUG_ON.
>
> This series solves this problem by making the user/kernel ownership a
> core concept at the IOMMU layer. The driver core enforces kernel
> ownership while drivers are bound and violations now result in a error
> codes during probe, not BUG_ON failures.
>
> Patch partitions:
> [PATCH 1-9]: Detect DMA ownership conflicts during driver binding;
> [PATCH 10-13]: Add security context management for assigned devices;
> [PATCH 14-18]: Various cleanups.
>
> This is part one of three initial series for IOMMUFD:
> * Move IOMMU Group security into the iommu layer
> - Generic IOMMUFD implementation
> - VFIO ability to consume IOMMUFD
Thank you very much for reviewing my series. The v4 of this series has
been posted here:
https://lore.kernel.org/linux-iommu/[email protected]/
Best regards,
baolu