2023-01-10 03:52:34

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v5 0/5] iommu: Retire detach_dev callback

Hi folks,

The iommu core calls the driver's detach_dev domain op callback only when
a device is finished assigning to user space and
iommu_group_release_dma_owner() is called to return the device to the
kernel, where iommu core wants to set the default domain to the device but
the driver didn't provide one. The code looks like:

/*
* New drivers should support default domains and so the detach_dev() op
* will never be called. Otherwise the NULL domain represents some
* platform specific behavior.
*/
if (!new_domain) {
if (WARN_ON(!group->domain->ops->detach_dev))
return -EINVAL;
__iommu_group_for_each_dev(group, group->domain,
iommu_group_do_detach_device);
group->domain = NULL;
return 0;
}

In other words, if the iommu driver provides default domains, the
.detach_dev callback will never be called; Otherwise, the .detach_dev
callback is actually called to return control back to the platform DMA
ops, other than detaching the domain from device.

This series cleanups this by:

- If the IOMMU driver provides default domains, remove .detach_dev
callback.
- Adds a new set_platform_dma iommu op. Any IOMMU driver that doesn't
provide default domain should implement set_platform_dma callback
instead.
- Retire .detach_dev callback.

This series originates from various discussion in the community. Thanks
to Jason, Robin and all others for their ideas.

The whole series is available on github:
https://github.com/LuBaolu/intel-iommu/commits/iommu-retire-detach_dev-v5

Change log:
v5:
- Merge some patches to make the series cute. No functionality changes.
- Check the op directly and WARN_ON the lack of any necessary
callbacks. Get rid of the ret and EINVAL.

v4:
- https://lore.kernel.org/linux-iommu/[email protected]/
- Drop the patch which renamed .attach_dev to .set_dev. As Robin said,
"... I also wouldn't say that "attach" loses its meaning in a context
where an equivalent "detach" operation is only ever implicit in
reattaching to something else...". If we have a better name in the
future, we can do it in other series.
- Adjust the patch of "iommu: Add set_platform_dma_ops iommu ops"
according to Jason's following suggestion " ... This is a bit ugly,
it would be better to make the previous patch call set_platform_dma
if it is set instead of detach_dev and then these patches should just
rename the driver's fsl_pamu_detach_device to
fsl_pamu_set_platform_dma ..."
- Add a new patch to remove deferred attach check from
__iommu_detach_domain() path. Make it a separate patch as the
prerequisite to remove __iommu_detach_device() helper.
- Rename set_platform_dma to set_platform_dma_ops to make it more
meaningful.

v3:
- https://lore.kernel.org/linux-iommu/[email protected]/
- Setting blocking domain is not conceptually equal to detach_dev.
Dropped all blocking domain related staffs in the previous version.

v2:
- https://lore.kernel.org/linux-iommu/[email protected]/
- Replace .detach_dev callback with static block domain ops;
- Rename .attach_dev to .set_dev.

v1:
- https://lore.kernel.org/linux-iommu/[email protected]/

Jason Gunthorpe (1):
iommu: Remove deferred attach check from __iommu_detach_device()

Lu Baolu (4):
iommu: Remove detach_dev callbacks
iommu: Add set_platform_dma_ops iommu ops
iommu: Add set_platform_dma_ops callbacks
iommu: Remove detach_dev callback

include/linux/iommu.h | 8 ++-
include/trace/events/iommu.h | 7 --
drivers/iommu/amd/iommu.c | 26 -------
drivers/iommu/apple-dart.c | 24 -------
drivers/iommu/arm/arm-smmu/qcom_iommu.c | 23 ------
drivers/iommu/exynos-iommu.c | 1 -
drivers/iommu/fsl_pamu_domain.c | 6 +-
drivers/iommu/iommu-traces.c | 1 -
drivers/iommu/iommu.c | 94 ++++++++++++-------------
drivers/iommu/ipmmu-vmsa.c | 16 -----
drivers/iommu/msm_iommu.c | 6 +-
drivers/iommu/mtk_iommu.c | 9 ---
drivers/iommu/mtk_iommu_v1.c | 4 +-
drivers/iommu/omap-iommu.c | 6 +-
drivers/iommu/rockchip-iommu.c | 1 -
drivers/iommu/s390-iommu.c | 7 +-
drivers/iommu/sprd-iommu.c | 16 -----
drivers/iommu/sun50i-iommu.c | 1 -
drivers/iommu/tegra-gart.c | 6 +-
drivers/iommu/tegra-smmu.c | 5 +-
20 files changed, 69 insertions(+), 198 deletions(-)

--
2.34.1


2023-01-10 04:03:52

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v5 2/5] iommu: Add set_platform_dma_ops iommu ops

When VFIO finishes assigning a device to user space and calls
iommu_group_release_dma_owner() to return the device to kernel, the IOMMU
core will attach the default domain to the device. Unfortunately, some
IOMMU drivers don't support default domain, hence in the end, the core
calls .detach_dev instead.

This adds set_platform_dma_ops iommu ops to make it clear that what it
does is returning control back to the platform DMA ops.

Suggested-by: Jason Gunthorpe <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
---
include/linux/iommu.h | 4 ++++
drivers/iommu/iommu.c | 28 ++++++++++++++++++++++++----
2 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 46e1347bfa22..7b3e3775b069 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -228,6 +228,9 @@ struct iommu_iotlb_gather {
* @release_device: Remove device from iommu driver handling
* @probe_finalize: Do final setup work after the device is added to an IOMMU
* group and attached to the groups domain
+ * @set_platform_dma_ops: Returning control back to the platform DMA ops. This op
+ * is to support old IOMMU drivers, new drivers should use
+ * default domains, and the common IOMMU DMA ops.
* @device_group: find iommu group for a particular device
* @get_resv_regions: Request list of reserved regions for a device
* @of_xlate: add OF master IDs to iommu grouping
@@ -256,6 +259,7 @@ struct iommu_ops {
struct iommu_device *(*probe_device)(struct device *dev);
void (*release_device)(struct device *dev);
void (*probe_finalize)(struct device *dev);
+ void (*set_platform_dma_ops)(struct device *dev);
struct iommu_group *(*device_group)(struct device *dev);

/* Request/Free a list of reserved regions for a device */
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index de91dd88705b..1c8b2c7678f7 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2163,6 +2163,16 @@ static int iommu_group_do_detach_device(struct device *dev, void *data)
return 0;
}

+static int iommu_group_do_set_platform_dma(struct device *dev, void *data)
+{
+ const struct iommu_ops *ops = dev_iommu_ops(dev);
+
+ if (!WARN_ON(!ops->set_platform_dma_ops))
+ ops->set_platform_dma_ops(dev);
+
+ return 0;
+}
+
static int __iommu_group_set_domain(struct iommu_group *group,
struct iommu_domain *new_domain)
{
@@ -2177,10 +2187,20 @@ static int __iommu_group_set_domain(struct iommu_group *group,
* platform specific behavior.
*/
if (!new_domain) {
- if (WARN_ON(!group->domain->ops->detach_dev))
- return -EINVAL;
- __iommu_group_for_each_dev(group, group->domain,
- iommu_group_do_detach_device);
+ struct group_device *grp_dev;
+
+ grp_dev = list_first_entry(&group->devices,
+ struct group_device, list);
+
+ if (dev_iommu_ops(grp_dev->dev)->set_platform_dma_ops)
+ __iommu_group_for_each_dev(group, NULL,
+ iommu_group_do_set_platform_dma);
+ else if (group->domain->ops->detach_dev)
+ __iommu_group_for_each_dev(group, group->domain,
+ iommu_group_do_detach_device);
+ else
+ WARN_ON_ONCE(1);
+
group->domain = NULL;
return 0;
}
--
2.34.1

2023-01-10 04:03:55

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v5 3/5] iommu: Add set_platform_dma_ops callbacks

For those IOMMU drivers that don't provide default domain support, add an
implementation of set_platform_dma_ops callback so that the IOMMU core
could return the DMA control to platform DMA ops. At the same time, with
the set_platform_dma_ops implemented, there is no need for detach_dev.
Remove it to avoid dead code.

Reviewed-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/iommu/fsl_pamu_domain.c | 6 +++---
drivers/iommu/msm_iommu.c | 6 +++---
drivers/iommu/mtk_iommu_v1.c | 4 ++--
drivers/iommu/omap-iommu.c | 6 +++---
drivers/iommu/s390-iommu.c | 7 ++-----
drivers/iommu/tegra-gart.c | 6 +++---
drivers/iommu/tegra-smmu.c | 5 +++--
7 files changed, 19 insertions(+), 21 deletions(-)

diff --git a/drivers/iommu/fsl_pamu_domain.c b/drivers/iommu/fsl_pamu_domain.c
index 4408ac3c49b6..e123161c211a 100644
--- a/drivers/iommu/fsl_pamu_domain.c
+++ b/drivers/iommu/fsl_pamu_domain.c
@@ -283,9 +283,9 @@ static int fsl_pamu_attach_device(struct iommu_domain *domain,
return ret;
}

-static void fsl_pamu_detach_device(struct iommu_domain *domain,
- struct device *dev)
+static void fsl_pamu_set_platform_dma(struct device *dev)
{
+ struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct fsl_dma_domain *dma_domain = to_fsl_dma_domain(domain);
const u32 *prop;
int len;
@@ -452,9 +452,9 @@ static const struct iommu_ops fsl_pamu_ops = {
.domain_alloc = fsl_pamu_domain_alloc,
.probe_device = fsl_pamu_probe_device,
.device_group = fsl_pamu_device_group,
+ .set_platform_dma_ops = fsl_pamu_set_platform_dma;
.default_domain_ops = &(const struct iommu_domain_ops) {
.attach_dev = fsl_pamu_attach_device,
- .detach_dev = fsl_pamu_detach_device,
.iova_to_phys = fsl_pamu_iova_to_phys,
.free = fsl_pamu_domain_free,
}
diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
index c60624910872..454f6331c889 100644
--- a/drivers/iommu/msm_iommu.c
+++ b/drivers/iommu/msm_iommu.c
@@ -443,9 +443,9 @@ static int msm_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
return ret;
}

-static void msm_iommu_detach_dev(struct iommu_domain *domain,
- struct device *dev)
+static void msm_iommu_set_platform_dma(struct device *dev)
{
+ struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct msm_priv *priv = to_msm_priv(domain);
unsigned long flags;
struct msm_iommu_dev *iommu;
@@ -678,11 +678,11 @@ static struct iommu_ops msm_iommu_ops = {
.domain_alloc = msm_iommu_domain_alloc,
.probe_device = msm_iommu_probe_device,
.device_group = generic_device_group,
+ .set_platform_dma_ops = msm_iommu_set_platform_dma,
.pgsize_bitmap = MSM_IOMMU_PGSIZES,
.of_xlate = qcom_iommu_of_xlate,
.default_domain_ops = &(const struct iommu_domain_ops) {
.attach_dev = msm_iommu_attach_dev,
- .detach_dev = msm_iommu_detach_dev,
.map_pages = msm_iommu_map,
.unmap_pages = msm_iommu_unmap,
/*
diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
index 69682ee068d2..78d0a84c704f 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -319,7 +319,7 @@ static int mtk_iommu_v1_attach_device(struct iommu_domain *domain, struct device
return 0;
}

-static void mtk_iommu_v1_detach_device(struct iommu_domain *domain, struct device *dev)
+static void mtk_iommu_v1_set_platform_dma(struct device *dev)
{
struct mtk_iommu_v1_data *data = dev_iommu_priv_get(dev);

@@ -585,10 +585,10 @@ static const struct iommu_ops mtk_iommu_v1_ops = {
.def_domain_type = mtk_iommu_v1_def_domain_type,
.device_group = generic_device_group,
.pgsize_bitmap = MT2701_IOMMU_PAGE_SIZE,
+ .set_platform_dma_ops = mtk_iommu_v1_set_platform_dma,
.owner = THIS_MODULE,
.default_domain_ops = &(const struct iommu_domain_ops) {
.attach_dev = mtk_iommu_v1_attach_device,
- .detach_dev = mtk_iommu_v1_detach_device,
.map_pages = mtk_iommu_v1_map,
.unmap_pages = mtk_iommu_v1_unmap,
.iova_to_phys = mtk_iommu_v1_iova_to_phys,
diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c
index 2fd7702c6709..3ab078112a7c 100644
--- a/drivers/iommu/omap-iommu.c
+++ b/drivers/iommu/omap-iommu.c
@@ -1556,9 +1556,9 @@ static void _omap_iommu_detach_dev(struct omap_iommu_domain *omap_domain,
omap_domain->dev = NULL;
}

-static void omap_iommu_detach_dev(struct iommu_domain *domain,
- struct device *dev)
+static void omap_iommu_set_platform_dma(struct device *dev)
{
+ struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct omap_iommu_domain *omap_domain = to_omap_domain(domain);

spin_lock(&omap_domain->lock);
@@ -1737,10 +1737,10 @@ static const struct iommu_ops omap_iommu_ops = {
.probe_device = omap_iommu_probe_device,
.release_device = omap_iommu_release_device,
.device_group = omap_iommu_device_group,
+ .set_platform_dma_ops = omap_iommu_set_platform_dma,
.pgsize_bitmap = OMAP_IOMMU_PGSIZES,
.default_domain_ops = &(const struct iommu_domain_ops) {
.attach_dev = omap_iommu_attach_dev,
- .detach_dev = omap_iommu_detach_dev,
.map = omap_iommu_map,
.unmap = omap_iommu_unmap,
.iova_to_phys = omap_iommu_iova_to_phys,
diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index ed33c6cce083..5591dab99446 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -144,13 +144,10 @@ static int s390_iommu_attach_device(struct iommu_domain *domain,
return 0;
}

-static void s390_iommu_detach_device(struct iommu_domain *domain,
- struct device *dev)
+static void s390_iommu_set_platform_dma(struct device *dev)
{
struct zpci_dev *zdev = to_zpci_dev(dev);

- WARN_ON(zdev->s390_domain != to_s390_domain(domain));
-
__s390_iommu_detach_device(zdev);
zpci_dma_init_device(zdev);
}
@@ -435,11 +432,11 @@ static const struct iommu_ops s390_iommu_ops = {
.probe_device = s390_iommu_probe_device,
.release_device = s390_iommu_release_device,
.device_group = generic_device_group,
+ .set_platform_dma_ops = s390_iommu_set_platform_dma,
.pgsize_bitmap = SZ_4K,
.get_resv_regions = s390_iommu_get_resv_regions,
.default_domain_ops = &(const struct iommu_domain_ops) {
.attach_dev = s390_iommu_attach_device,
- .detach_dev = s390_iommu_detach_device,
.map_pages = s390_iommu_map_pages,
.unmap_pages = s390_iommu_unmap_pages,
.flush_iotlb_all = s390_iommu_flush_iotlb_all,
diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index ed53279d1106..a482ff838b53 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -124,9 +124,9 @@ static int gart_iommu_attach_dev(struct iommu_domain *domain,
return ret;
}

-static void gart_iommu_detach_dev(struct iommu_domain *domain,
- struct device *dev)
+static void gart_iommu_set_platform_dma(struct device *dev)
{
+ struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct gart_device *gart = gart_handle;

spin_lock(&gart->dom_lock);
@@ -270,11 +270,11 @@ static const struct iommu_ops gart_iommu_ops = {
.domain_alloc = gart_iommu_domain_alloc,
.probe_device = gart_iommu_probe_device,
.device_group = generic_device_group,
+ .set_platform_dma_ops = gart_iommu_set_platform_dma,
.pgsize_bitmap = GART_IOMMU_PGSIZES,
.of_xlate = gart_iommu_of_xlate,
.default_domain_ops = &(const struct iommu_domain_ops) {
.attach_dev = gart_iommu_attach_dev,
- .detach_dev = gart_iommu_detach_dev,
.map = gart_iommu_map,
.unmap = gart_iommu_unmap,
.iova_to_phys = gart_iommu_iova_to_phys,
diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 5b1af40221ec..4c4ac22d5fb1 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -511,8 +511,9 @@ static int tegra_smmu_attach_dev(struct iommu_domain *domain,
return err;
}

-static void tegra_smmu_detach_dev(struct iommu_domain *domain, struct device *dev)
+static void tegra_smmu_set_platform_dma(struct device *dev)
{
+ struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct tegra_smmu_as *as = to_smmu_as(domain);
struct tegra_smmu *smmu = as->smmu;
@@ -965,11 +966,11 @@ static const struct iommu_ops tegra_smmu_ops = {
.domain_alloc = tegra_smmu_domain_alloc,
.probe_device = tegra_smmu_probe_device,
.device_group = tegra_smmu_device_group,
+ .set_platform_dma_ops = tegra_smmu_set_platform_dma,
.of_xlate = tegra_smmu_of_xlate,
.pgsize_bitmap = SZ_4K,
.default_domain_ops = &(const struct iommu_domain_ops) {
.attach_dev = tegra_smmu_attach_dev,
- .detach_dev = tegra_smmu_detach_dev,
.map = tegra_smmu_map,
.unmap = tegra_smmu_unmap,
.iova_to_phys = tegra_smmu_iova_to_phys,
--
2.34.1

2023-01-13 16:25:09

by Joerg Roedel

[permalink] [raw]
Subject: Re: [PATCH v5 0/5] iommu: Retire detach_dev callback

On Tue, Jan 10, 2023 at 10:54:03AM +0800, Lu Baolu wrote:
> Jason Gunthorpe (1):
> iommu: Remove deferred attach check from __iommu_detach_device()
>
> Lu Baolu (4):
> iommu: Remove detach_dev callbacks
> iommu: Add set_platform_dma_ops iommu ops
> iommu: Add set_platform_dma_ops callbacks
> iommu: Remove detach_dev callback

Applied, thanks.

2023-01-16 17:03:48

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v5 0/5] iommu: Retire detach_dev callback

On Mon, Jan 16, 2023 at 05:24:56PM +0100, Marek Szyprowski wrote:

> > The iommu core calls the driver's detach_dev domain op callback only when
> > a device is finished assigning to user space and
> > iommu_group_release_dma_owner() is called to return the device to the
> > kernel, where iommu core wants to set the default domain to the device but
> > the driver didn't provide one. The code looks like:
> >
> > /*
> > * New drivers should support default domains and so the detach_dev() op
> > * will never be called. Otherwise the NULL domain represents some
> > * platform specific behavior.
> > */
> > if (!new_domain) {
> > if (WARN_ON(!group->domain->ops->detach_dev))
> > return -EINVAL;
> > __iommu_group_for_each_dev(group, group->domain,
> > iommu_group_do_detach_device);
> > group->domain = NULL;
> > return 0;
> > }
> >
> > In other words, if the iommu driver provides default domains, the
> > .detach_dev callback will never be called; Otherwise, the .detach_dev
> > callback is actually called to return control back to the platform DMA
> > ops, other than detaching the domain from device.
> >
> > This series cleanups this by:
> >
> > - If the IOMMU driver provides default domains, remove .detach_dev
> > callback.
> > - Adds a new set_platform_dma iommu op. Any IOMMU driver that doesn't
> > provide default domain should implement set_platform_dma callback
> > instead.
> > - Retire .detach_dev callback.
> >
> > This series originates from various discussion in the community. Thanks
> > to Jason, Robin and all others for their ideas.
>
> I wonder how to handle the ARM 32bit case, which doesn't use the default
> domain solution. Today, once this patchset has been merged to
> linux-next, I've noticed that it broke booting of ARM 32bit Exynos based
> boards.

It is supposed to work, can you help debug what went wrong?

Jason

2023-01-16 18:02:29

by Marek Szyprowski

[permalink] [raw]
Subject: Re: [PATCH v5 0/5] iommu: Retire detach_dev callback

Hi All,


On 10.01.2023 03:54, Lu Baolu wrote:
> Hi folks,
>
> The iommu core calls the driver's detach_dev domain op callback only when
> a device is finished assigning to user space and
> iommu_group_release_dma_owner() is called to return the device to the
> kernel, where iommu core wants to set the default domain to the device but
> the driver didn't provide one. The code looks like:
>
> /*
> * New drivers should support default domains and so the detach_dev() op
> * will never be called. Otherwise the NULL domain represents some
> * platform specific behavior.
> */
> if (!new_domain) {
> if (WARN_ON(!group->domain->ops->detach_dev))
> return -EINVAL;
> __iommu_group_for_each_dev(group, group->domain,
> iommu_group_do_detach_device);
> group->domain = NULL;
> return 0;
> }
>
> In other words, if the iommu driver provides default domains, the
> .detach_dev callback will never be called; Otherwise, the .detach_dev
> callback is actually called to return control back to the platform DMA
> ops, other than detaching the domain from device.
>
> This series cleanups this by:
>
> - If the IOMMU driver provides default domains, remove .detach_dev
> callback.
> - Adds a new set_platform_dma iommu op. Any IOMMU driver that doesn't
> provide default domain should implement set_platform_dma callback
> instead.
> - Retire .detach_dev callback.
>
> This series originates from various discussion in the community. Thanks
> to Jason, Robin and all others for their ideas.

I wonder how to handle the ARM 32bit case, which doesn't use the default
domain solution. Today, once this patchset has been merged to
linux-next, I've noticed that it broke booting of ARM 32bit Exynos based
boards.

The final solution would be to switch ARM 32bit to generic DMA-IOMMU
glue, but I'm not sure it this will happen soon. I will try to check if
any kind of quick workaround can be applied to get it working again.


> The whole series is available on github:
> https://protect2.fireeye.com/v1/url?k=ef2183b6-b0bababa-ef2008f9-000babff3563-336828b3433153d2&q=1&e=4ae5dae4-383f-4a82-9449-3f08f0422cb1&u=https%3A%2F%2Fgithub.com%2FLuBaolu%2Fintel-iommu%2Fcommits%2Fiommu-retire-detach_dev-v5
>
> Change log:
> v5:
> - Merge some patches to make the series cute. No functionality changes.
> - Check the op directly and WARN_ON the lack of any necessary
> callbacks. Get rid of the ret and EINVAL.
>
> v4:
> - https://lore.kernel.org/linux-iommu/[email protected]/
> - Drop the patch which renamed .attach_dev to .set_dev. As Robin said,
> "... I also wouldn't say that "attach" loses its meaning in a context
> where an equivalent "detach" operation is only ever implicit in
> reattaching to something else...". If we have a better name in the
> future, we can do it in other series.
> - Adjust the patch of "iommu: Add set_platform_dma_ops iommu ops"
> according to Jason's following suggestion " ... This is a bit ugly,
> it would be better to make the previous patch call set_platform_dma
> if it is set instead of detach_dev and then these patches should just
> rename the driver's fsl_pamu_detach_device to
> fsl_pamu_set_platform_dma ..."
> - Add a new patch to remove deferred attach check from
> __iommu_detach_domain() path. Make it a separate patch as the
> prerequisite to remove __iommu_detach_device() helper.
> - Rename set_platform_dma to set_platform_dma_ops to make it more
> meaningful.
>
> v3:
> - https://lore.kernel.org/linux-iommu/[email protected]/
> - Setting blocking domain is not conceptually equal to detach_dev.
> Dropped all blocking domain related staffs in the previous version.
>
> v2:
> - https://lore.kernel.org/linux-iommu/[email protected]/
> - Replace .detach_dev callback with static block domain ops;
> - Rename .attach_dev to .set_dev.
>
> v1:
> - https://lore.kernel.org/linux-iommu/[email protected]/
>
> Jason Gunthorpe (1):
> iommu: Remove deferred attach check from __iommu_detach_device()
>
> Lu Baolu (4):
> iommu: Remove detach_dev callbacks
> iommu: Add set_platform_dma_ops iommu ops
> iommu: Add set_platform_dma_ops callbacks
> iommu: Remove detach_dev callback
>
> include/linux/iommu.h | 8 ++-
> include/trace/events/iommu.h | 7 --
> drivers/iommu/amd/iommu.c | 26 -------
> drivers/iommu/apple-dart.c | 24 -------
> drivers/iommu/arm/arm-smmu/qcom_iommu.c | 23 ------
> drivers/iommu/exynos-iommu.c | 1 -
> drivers/iommu/fsl_pamu_domain.c | 6 +-
> drivers/iommu/iommu-traces.c | 1 -
> drivers/iommu/iommu.c | 94 ++++++++++++-------------
> drivers/iommu/ipmmu-vmsa.c | 16 -----
> drivers/iommu/msm_iommu.c | 6 +-
> drivers/iommu/mtk_iommu.c | 9 ---
> drivers/iommu/mtk_iommu_v1.c | 4 +-
> drivers/iommu/omap-iommu.c | 6 +-
> drivers/iommu/rockchip-iommu.c | 1 -
> drivers/iommu/s390-iommu.c | 7 +-
> drivers/iommu/sprd-iommu.c | 16 -----
> drivers/iommu/sun50i-iommu.c | 1 -
> drivers/iommu/tegra-gart.c | 6 +-
> drivers/iommu/tegra-smmu.c | 5 +-
> 20 files changed, 69 insertions(+), 198 deletions(-)
>
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland

2023-03-15 15:51:04

by Steven Price

[permalink] [raw]
Subject: Re: [PATCH v5 0/5] iommu: Retire detach_dev callback

On 10/01/2023 02:54, Lu Baolu wrote:
> Hi folks,
>
> The iommu core calls the driver's detach_dev domain op callback only when
> a device is finished assigning to user space and
> iommu_group_release_dma_owner() is called to return the device to the
> kernel, where iommu core wants to set the default domain to the device but
> the driver didn't provide one. The code looks like:
>
> /*
> * New drivers should support default domains and so the detach_dev() op
> * will never be called. Otherwise the NULL domain represents some
> * platform specific behavior.
> */
> if (!new_domain) {
> if (WARN_ON(!group->domain->ops->detach_dev))
> return -EINVAL;
> __iommu_group_for_each_dev(group, group->domain,
> iommu_group_do_detach_device);
> group->domain = NULL;
> return 0;
> }
>
> In other words, if the iommu driver provides default domains, the
> .detach_dev callback will never be called; Otherwise, the .detach_dev
> callback is actually called to return control back to the platform DMA
> ops, other than detaching the domain from device.
>
> This series cleanups this by:
>
> - If the IOMMU driver provides default domains, remove .detach_dev
> callback.
> - Adds a new set_platform_dma iommu op. Any IOMMU driver that doesn't
> provide default domain should implement set_platform_dma callback
> instead.
> - Retire .detach_dev callback.
>
> This series originates from various discussion in the community. Thanks
> to Jason, Robin and all others for their ideas.
>
> The whole series is available on github:
> https://github.com/LuBaolu/intel-iommu/commits/iommu-retire-detach_dev-v5
>
> Change log:
> v5:
> - Merge some patches to make the series cute. No functionality changes.
> - Check the op directly and WARN_ON the lack of any necessary
> callbacks. Get rid of the ret and EINVAL.
>
> v4:
> - https://lore.kernel.org/linux-iommu/[email protected]/
> - Drop the patch which renamed .attach_dev to .set_dev. As Robin said,
> "... I also wouldn't say that "attach" loses its meaning in a context
> where an equivalent "detach" operation is only ever implicit in
> reattaching to something else...". If we have a better name in the
> future, we can do it in other series.
> - Adjust the patch of "iommu: Add set_platform_dma_ops iommu ops"
> according to Jason's following suggestion " ... This is a bit ugly,
> it would be better to make the previous patch call set_platform_dma
> if it is set instead of detach_dev and then these patches should just
> rename the driver's fsl_pamu_detach_device to
> fsl_pamu_set_platform_dma ..."
> - Add a new patch to remove deferred attach check from
> __iommu_detach_domain() path. Make it a separate patch as the
> prerequisite to remove __iommu_detach_device() helper.
> - Rename set_platform_dma to set_platform_dma_ops to make it more
> meaningful.
>
> v3:
> - https://lore.kernel.org/linux-iommu/[email protected]/
> - Setting blocking domain is not conceptually equal to detach_dev.
> Dropped all blocking domain related staffs in the previous version.
>
> v2:
> - https://lore.kernel.org/linux-iommu/[email protected]/
> - Replace .detach_dev callback with static block domain ops;
> - Rename .attach_dev to .set_dev.
>
> v1:
> - https://lore.kernel.org/linux-iommu/[email protected]/
>
> Jason Gunthorpe (1):
> iommu: Remove deferred attach check from __iommu_detach_device()
>
> Lu Baolu (4):
> iommu: Remove detach_dev callbacks
> iommu: Add set_platform_dma_ops iommu ops
> iommu: Add set_platform_dma_ops callbacks
> iommu: Remove detach_dev callback
>
> include/linux/iommu.h | 8 ++-
> include/trace/events/iommu.h | 7 --
> drivers/iommu/amd/iommu.c | 26 -------
> drivers/iommu/apple-dart.c | 24 -------
> drivers/iommu/arm/arm-smmu/qcom_iommu.c | 23 ------
> drivers/iommu/exynos-iommu.c | 1 -
> drivers/iommu/fsl_pamu_domain.c | 6 +-
> drivers/iommu/iommu-traces.c | 1 -
> drivers/iommu/iommu.c | 94 ++++++++++++-------------
> drivers/iommu/ipmmu-vmsa.c | 16 -----
> drivers/iommu/msm_iommu.c | 6 +-
> drivers/iommu/mtk_iommu.c | 9 ---
> drivers/iommu/mtk_iommu_v1.c | 4 +-
> drivers/iommu/omap-iommu.c | 6 +-
> drivers/iommu/rockchip-iommu.c | 1 -
> drivers/iommu/s390-iommu.c | 7 +-
> drivers/iommu/sprd-iommu.c | 16 -----
> drivers/iommu/sun50i-iommu.c | 1 -
> drivers/iommu/tegra-gart.c | 6 +-
> drivers/iommu/tegra-smmu.c | 5 +-
> 20 files changed, 69 insertions(+), 198 deletions(-)
>

I hit a problem with this series on a Firefly-RK3288, I bisected down to
1b932ceddd19 ("iommu: Remove detach_dev callbacks"). The first splat is:

[ 7.227853] ------------[ cut here ]------------
[ 7.227900] WARNING: CPU: 0 PID: 9 at drivers/iommu/iommu.c:2198 __iommu_group_set_domain+0xb4/0xc8
[ 7.227920] Modules linked in:
[ 7.227935] CPU: 0 PID: 9 Comm: kworker/u8:0 Not tainted 6.3.0-rc1 #1
[ 7.227942] Hardware name: Rockchip (Device Tree)
[ 7.227948] Workqueue: events_unbound deferred_probe_work_func
[ 7.227964] unwind_backtrace from show_stack+0x10/0x14
[ 7.227978] show_stack from dump_stack_lvl+0x58/0x70
[ 7.227992] dump_stack_lvl from __warn+0x7c/0x1dc
[ 7.228008] __warn from warn_slowpath_fmt+0xbc/0x1a0
[ 7.228022] warn_slowpath_fmt from __iommu_group_set_domain+0xb4/0xc8
[ 7.228035] __iommu_group_set_domain from iommu_detach_device+0x84/0xf8
[ 7.228046] iommu_detach_device from arm_iommu_detach_device+0x24/0xc4
[ 7.228057] arm_iommu_detach_device from rockchip_drm_dma_attach_device+0x30/0x74
[ 7.228073] rockchip_drm_dma_attach_device from vop_crtc_atomic_enable+0xf8/0xab0
[ 7.228085] vop_crtc_atomic_enable from drm_atomic_helper_commit_modeset_enables+0xb0/0x2a0
[ 7.228097] drm_atomic_helper_commit_modeset_enables from drm_atomic_helper_commit_tail_rpm+0x44/0x8c
[ 7.228111] drm_atomic_helper_commit_tail_rpm from commit_tail+0x9c/0x184
[ 7.228124] commit_tail from drm_atomic_helper_commit+0x164/0x18c
[ 7.228137] drm_atomic_helper_commit from drm_atomic_commit+0xb0/0xe8
[ 7.228153] drm_atomic_commit from drm_client_modeset_commit_atomic+0x240/0x288
[ 7.228166] drm_client_modeset_commit_atomic from drm_client_modeset_commit_locked+0x60/0x1cc
[ 7.228174] drm_client_modeset_commit_locked from drm_client_modeset_commit+0x24/0x40
[ 7.228183] drm_client_modeset_commit from drm_fb_helper_set_par+0xb8/0xf8
[ 7.228197] drm_fb_helper_set_par from fbcon_init+0x2a0/0x534
[ 7.228211] fbcon_init from visual_init+0xc0/0x108
[ 7.228224] visual_init from do_bind_con_driver+0x1bc/0x3a8
[ 7.228237] do_bind_con_driver from do_take_over_console+0x134/0x1d4
[ 7.228251] do_take_over_console from do_fbcon_takeover+0x6c/0xcc
[ 7.228264] do_fbcon_takeover from fbcon_fb_registered+0x198/0x1a8
[ 7.228277] fbcon_fb_registered from register_framebuffer+0x1d0/0x268
[ 7.228292] register_framebuffer from __drm_fb_helper_initial_config_and_unlock+0x358/0x578
[ 7.228308] __drm_fb_helper_initial_config_and_unlock from drm_fbdev_client_hotplug+0x6c/0xa8
[ 7.228322] drm_fbdev_client_hotplug from drm_fbdev_generic_setup+0x84/0x174
[ 7.228335] drm_fbdev_generic_setup from rockchip_drm_bind+0x1dc/0x200
[ 7.228349] rockchip_drm_bind from try_to_bring_up_aggregate_device+0x200/0x2d8
[ 7.228368] try_to_bring_up_aggregate_device from component_master_add_with_match+0xc4/0xf8
[ 7.228380] component_master_add_with_match from rockchip_drm_platform_probe+0x150/0x254
[ 7.228392] rockchip_drm_platform_probe from platform_probe+0x5c/0xb8
[ 7.228405] platform_probe from really_probe+0xe0/0x400
[ 7.228417] really_probe from __driver_probe_device+0x9c/0x1fc
[ 7.228431] __driver_probe_device from driver_probe_device+0x30/0xc0
[ 7.228445] driver_probe_device from __device_attach_driver+0xa8/0x120
[ 7.228460] __device_attach_driver from bus_for_each_drv+0x84/0xdc
[ 7.228474] bus_for_each_drv from __device_attach+0xb0/0x20c
[ 7.228488] __device_attach from bus_probe_device+0x8c/0x90
[ 7.228502] bus_probe_device from deferred_probe_work_func+0x98/0xe0
[ 7.228515] deferred_probe_work_func from process_one_work+0x290/0x740
[ 7.228528] process_one_work from worker_thread+0x54/0x518
[ 7.228536] worker_thread from kthread+0xf0/0x110
[ 7.228548] kthread from ret_from_fork+0x14/0x2c
[ 7.228561] Exception stack(0xf084dfb0 to 0xf084dff8)
[ 7.228567] dfa0: 00000000 00000000 00000000 00000000
[ 7.228573] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 7.228579] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 7.228585] irq event stamp: 138379
[ 7.228592] hardirqs last enabled at (138385): [<c03c42ac>] vprintk_emit+0x330/0x354
[ 7.228606] hardirqs last disabled at (138390): [<c03c4268>] vprintk_emit+0x2ec/0x354
[ 7.228617] softirqs last enabled at (137258): [<c03016ac>] __do_softirq+0x2f8/0x548
[ 7.228628] softirqs last disabled at (137253): [<c0350218>] __irq_exit_rcu+0x14c/0x170
[ 7.228639] ---[ end trace 0000000000000000 ]---

(complete log attached)

I'm not sure how to debug this. From the callstack I can see that
__iommu_group_set_domain is getting a NULL new_domain which triggers
a call to __iommu_group_do_set_platform_dma which is WARNing because
there is no set_platform_dma_ops callback. The NULL new_domain is
because group->default_domain is NULL in __iommu_group_set_core_domain.

So is the logic in the first patch that there is a default domain
incorrect for this rockchip driver? Or is this callpath just hitting
the function before the default domain can be created?

Help debugging this would be appreciated - I'm not very familiar
with this area of code.

Thanks,

Steve


Attachments:
bootlog.txt (54.40 kB)

2023-03-15 15:59:04

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH v5 0/5] iommu: Retire detach_dev callback

On 2023-03-15 15:49, Steven Price wrote:
> On 10/01/2023 02:54, Lu Baolu wrote:
>> Hi folks,
>>
>> The iommu core calls the driver's detach_dev domain op callback only when
>> a device is finished assigning to user space and
>> iommu_group_release_dma_owner() is called to return the device to the
>> kernel, where iommu core wants to set the default domain to the device but
>> the driver didn't provide one. The code looks like:
>>
>> /*
>> * New drivers should support default domains and so the detach_dev() op
>> * will never be called. Otherwise the NULL domain represents some
>> * platform specific behavior.
>> */
>> if (!new_domain) {
>> if (WARN_ON(!group->domain->ops->detach_dev))
>> return -EINVAL;
>> __iommu_group_for_each_dev(group, group->domain,
>> iommu_group_do_detach_device);
>> group->domain = NULL;
>> return 0;
>> }
>>
>> In other words, if the iommu driver provides default domains, the
>> .detach_dev callback will never be called; Otherwise, the .detach_dev
>> callback is actually called to return control back to the platform DMA
>> ops, other than detaching the domain from device.
>>
>> This series cleanups this by:
>>
>> - If the IOMMU driver provides default domains, remove .detach_dev
>> callback.
>> - Adds a new set_platform_dma iommu op. Any IOMMU driver that doesn't
>> provide default domain should implement set_platform_dma callback
>> instead.
>> - Retire .detach_dev callback.
>>
>> This series originates from various discussion in the community. Thanks
>> to Jason, Robin and all others for their ideas.
>>
>> The whole series is available on github:
>> https://github.com/LuBaolu/intel-iommu/commits/iommu-retire-detach_dev-v5
>>
>> Change log:
>> v5:
>> - Merge some patches to make the series cute. No functionality changes.
>> - Check the op directly and WARN_ON the lack of any necessary
>> callbacks. Get rid of the ret and EINVAL.
>>
>> v4:
>> - https://lore.kernel.org/linux-iommu/[email protected]/
>> - Drop the patch which renamed .attach_dev to .set_dev. As Robin said,
>> "... I also wouldn't say that "attach" loses its meaning in a context
>> where an equivalent "detach" operation is only ever implicit in
>> reattaching to something else...". If we have a better name in the
>> future, we can do it in other series.
>> - Adjust the patch of "iommu: Add set_platform_dma_ops iommu ops"
>> according to Jason's following suggestion " ... This is a bit ugly,
>> it would be better to make the previous patch call set_platform_dma
>> if it is set instead of detach_dev and then these patches should just
>> rename the driver's fsl_pamu_detach_device to
>> fsl_pamu_set_platform_dma ..."
>> - Add a new patch to remove deferred attach check from
>> __iommu_detach_domain() path. Make it a separate patch as the
>> prerequisite to remove __iommu_detach_device() helper.
>> - Rename set_platform_dma to set_platform_dma_ops to make it more
>> meaningful.
>>
>> v3:
>> - https://lore.kernel.org/linux-iommu/[email protected]/
>> - Setting blocking domain is not conceptually equal to detach_dev.
>> Dropped all blocking domain related staffs in the previous version.
>>
>> v2:
>> - https://lore.kernel.org/linux-iommu/[email protected]/
>> - Replace .detach_dev callback with static block domain ops;
>> - Rename .attach_dev to .set_dev.
>>
>> v1:
>> - https://lore.kernel.org/linux-iommu/[email protected]/
>>
>> Jason Gunthorpe (1):
>> iommu: Remove deferred attach check from __iommu_detach_device()
>>
>> Lu Baolu (4):
>> iommu: Remove detach_dev callbacks
>> iommu: Add set_platform_dma_ops iommu ops
>> iommu: Add set_platform_dma_ops callbacks
>> iommu: Remove detach_dev callback
>>
>> include/linux/iommu.h | 8 ++-
>> include/trace/events/iommu.h | 7 --
>> drivers/iommu/amd/iommu.c | 26 -------
>> drivers/iommu/apple-dart.c | 24 -------
>> drivers/iommu/arm/arm-smmu/qcom_iommu.c | 23 ------
>> drivers/iommu/exynos-iommu.c | 1 -
>> drivers/iommu/fsl_pamu_domain.c | 6 +-
>> drivers/iommu/iommu-traces.c | 1 -
>> drivers/iommu/iommu.c | 94 ++++++++++++-------------
>> drivers/iommu/ipmmu-vmsa.c | 16 -----
>> drivers/iommu/msm_iommu.c | 6 +-
>> drivers/iommu/mtk_iommu.c | 9 ---
>> drivers/iommu/mtk_iommu_v1.c | 4 +-
>> drivers/iommu/omap-iommu.c | 6 +-
>> drivers/iommu/rockchip-iommu.c | 1 -
>> drivers/iommu/s390-iommu.c | 7 +-
>> drivers/iommu/sprd-iommu.c | 16 -----
>> drivers/iommu/sun50i-iommu.c | 1 -
>> drivers/iommu/tegra-gart.c | 6 +-
>> drivers/iommu/tegra-smmu.c | 5 +-
>> 20 files changed, 69 insertions(+), 198 deletions(-)
>>
>
> I hit a problem with this series on a Firefly-RK3288, I bisected down to
> 1b932ceddd19 ("iommu: Remove detach_dev callbacks"). The first splat is:
>
> [ 7.227853] ------------[ cut here ]------------
> [ 7.227900] WARNING: CPU: 0 PID: 9 at drivers/iommu/iommu.c:2198 __iommu_group_set_domain+0xb4/0xc8
> [ 7.227920] Modules linked in:
> [ 7.227935] CPU: 0 PID: 9 Comm: kworker/u8:0 Not tainted 6.3.0-rc1 #1
> [ 7.227942] Hardware name: Rockchip (Device Tree)
> [ 7.227948] Workqueue: events_unbound deferred_probe_work_func
> [ 7.227964] unwind_backtrace from show_stack+0x10/0x14
> [ 7.227978] show_stack from dump_stack_lvl+0x58/0x70
> [ 7.227992] dump_stack_lvl from __warn+0x7c/0x1dc
> [ 7.228008] __warn from warn_slowpath_fmt+0xbc/0x1a0
> [ 7.228022] warn_slowpath_fmt from __iommu_group_set_domain+0xb4/0xc8
> [ 7.228035] __iommu_group_set_domain from iommu_detach_device+0x84/0xf8
> [ 7.228046] iommu_detach_device from arm_iommu_detach_device+0x24/0xc4
> [ 7.228057] arm_iommu_detach_device from rockchip_drm_dma_attach_device+0x30/0x74
> [ 7.228073] rockchip_drm_dma_attach_device from vop_crtc_atomic_enable+0xf8/0xab0
> [ 7.228085] vop_crtc_atomic_enable from drm_atomic_helper_commit_modeset_enables+0xb0/0x2a0
> [ 7.228097] drm_atomic_helper_commit_modeset_enables from drm_atomic_helper_commit_tail_rpm+0x44/0x8c
> [ 7.228111] drm_atomic_helper_commit_tail_rpm from commit_tail+0x9c/0x184
> [ 7.228124] commit_tail from drm_atomic_helper_commit+0x164/0x18c
> [ 7.228137] drm_atomic_helper_commit from drm_atomic_commit+0xb0/0xe8
> [ 7.228153] drm_atomic_commit from drm_client_modeset_commit_atomic+0x240/0x288
> [ 7.228166] drm_client_modeset_commit_atomic from drm_client_modeset_commit_locked+0x60/0x1cc
> [ 7.228174] drm_client_modeset_commit_locked from drm_client_modeset_commit+0x24/0x40
> [ 7.228183] drm_client_modeset_commit from drm_fb_helper_set_par+0xb8/0xf8
> [ 7.228197] drm_fb_helper_set_par from fbcon_init+0x2a0/0x534
> [ 7.228211] fbcon_init from visual_init+0xc0/0x108
> [ 7.228224] visual_init from do_bind_con_driver+0x1bc/0x3a8
> [ 7.228237] do_bind_con_driver from do_take_over_console+0x134/0x1d4
> [ 7.228251] do_take_over_console from do_fbcon_takeover+0x6c/0xcc
> [ 7.228264] do_fbcon_takeover from fbcon_fb_registered+0x198/0x1a8
> [ 7.228277] fbcon_fb_registered from register_framebuffer+0x1d0/0x268
> [ 7.228292] register_framebuffer from __drm_fb_helper_initial_config_and_unlock+0x358/0x578
> [ 7.228308] __drm_fb_helper_initial_config_and_unlock from drm_fbdev_client_hotplug+0x6c/0xa8
> [ 7.228322] drm_fbdev_client_hotplug from drm_fbdev_generic_setup+0x84/0x174
> [ 7.228335] drm_fbdev_generic_setup from rockchip_drm_bind+0x1dc/0x200
> [ 7.228349] rockchip_drm_bind from try_to_bring_up_aggregate_device+0x200/0x2d8
> [ 7.228368] try_to_bring_up_aggregate_device from component_master_add_with_match+0xc4/0xf8
> [ 7.228380] component_master_add_with_match from rockchip_drm_platform_probe+0x150/0x254
> [ 7.228392] rockchip_drm_platform_probe from platform_probe+0x5c/0xb8
> [ 7.228405] platform_probe from really_probe+0xe0/0x400
> [ 7.228417] really_probe from __driver_probe_device+0x9c/0x1fc
> [ 7.228431] __driver_probe_device from driver_probe_device+0x30/0xc0
> [ 7.228445] driver_probe_device from __device_attach_driver+0xa8/0x120
> [ 7.228460] __device_attach_driver from bus_for_each_drv+0x84/0xdc
> [ 7.228474] bus_for_each_drv from __device_attach+0xb0/0x20c
> [ 7.228488] __device_attach from bus_probe_device+0x8c/0x90
> [ 7.228502] bus_probe_device from deferred_probe_work_func+0x98/0xe0
> [ 7.228515] deferred_probe_work_func from process_one_work+0x290/0x740
> [ 7.228528] process_one_work from worker_thread+0x54/0x518
> [ 7.228536] worker_thread from kthread+0xf0/0x110
> [ 7.228548] kthread from ret_from_fork+0x14/0x2c
> [ 7.228561] Exception stack(0xf084dfb0 to 0xf084dff8)
> [ 7.228567] dfa0: 00000000 00000000 00000000 00000000
> [ 7.228573] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [ 7.228579] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
> [ 7.228585] irq event stamp: 138379
> [ 7.228592] hardirqs last enabled at (138385): [<c03c42ac>] vprintk_emit+0x330/0x354
> [ 7.228606] hardirqs last disabled at (138390): [<c03c4268>] vprintk_emit+0x2ec/0x354
> [ 7.228617] softirqs last enabled at (137258): [<c03016ac>] __do_softirq+0x2f8/0x548
> [ 7.228628] softirqs last disabled at (137253): [<c0350218>] __irq_exit_rcu+0x14c/0x170
> [ 7.228639] ---[ end trace 0000000000000000 ]---
>
> (complete log attached)
>
> I'm not sure how to debug this. From the callstack I can see that
> __iommu_group_set_domain is getting a NULL new_domain which triggers
> a call to __iommu_group_do_set_platform_dma which is WARNing because
> there is no set_platform_dma_ops callback. The NULL new_domain is
> because group->default_domain is NULL in __iommu_group_set_core_domain.
>
> So is the logic in the first patch that there is a default domain
> incorrect for this rockchip driver? Or is this callpath just hitting
> the function before the default domain can be created?
>
> Help debugging this would be appreciated - I'm not very familiar
> with this area of code.

It's the same thing Marek hit on Exynos: drivers which run on 32-bit Arm
need a .set_platform_dma op, regardless of whether they support default
domains on other platforms which use those.

Cheers,
Robin.

2023-03-15 16:36:18

by Steven Price

[permalink] [raw]
Subject: Re: [PATCH v5 0/5] iommu: Retire detach_dev callback

On 15/03/2023 15:57, Robin Murphy wrote:
> On 2023-03-15 15:49, Steven Price wrote:
>> On 10/01/2023 02:54, Lu Baolu wrote:
>>> Hi folks,
>>>
>>> The iommu core calls the driver's detach_dev domain op callback only
>>> when
>>> a device is finished assigning to user space and
>>> iommu_group_release_dma_owner() is called to return the device to the
>>> kernel, where iommu core wants to set the default domain to the
>>> device but
>>> the driver didn't provide one. The code looks like:
>>>
>>>          /*
>>>           * New drivers should support default domains and so the
>>> detach_dev() op
>>>           * will never be called. Otherwise the NULL domain
>>> represents some
>>>           * platform specific behavior.
>>>           */
>>>          if (!new_domain) {
>>>                  if (WARN_ON(!group->domain->ops->detach_dev))
>>>                          return -EINVAL;
>>>                  __iommu_group_for_each_dev(group, group->domain,
>>>                                            
>>> iommu_group_do_detach_device);
>>>                  group->domain = NULL;
>>>                  return 0;
>>>          }
>>>
>>> In other words, if the iommu driver provides default domains, the
>>> .detach_dev callback will never be called; Otherwise, the .detach_dev
>>> callback is actually called to return control back to the platform DMA
>>> ops, other than detaching the domain from device.
>>>
>>> This series cleanups this by:
>>>
>>> - If the IOMMU driver provides default domains, remove .detach_dev
>>>    callback.
>>> - Adds a new set_platform_dma iommu op. Any IOMMU driver that doesn't
>>>    provide default domain should implement set_platform_dma callback
>>>    instead.
>>> - Retire .detach_dev callback.
>>>
>>> This series originates from various discussion in the community. Thanks
>>> to Jason, Robin and all others for their ideas.
>>>
>>> The whole series is available on github:
>>> https://github.com/LuBaolu/intel-iommu/commits/iommu-retire-detach_dev-v5
>>>
>>> Change log:
>>> v5:
>>>   - Merge some patches to make the series cute. No functionality
>>> changes.
>>>   - Check the op directly and WARN_ON the lack of any necessary
>>>     callbacks. Get rid of the ret and EINVAL.
>>>
>>> v4:
>>>   -
>>> https://lore.kernel.org/linux-iommu/[email protected]/
>>>   - Drop the patch which renamed .attach_dev to .set_dev. As Robin said,
>>>     "... I also wouldn't say that "attach" loses its meaning in a
>>> context
>>>     where an equivalent "detach" operation is only ever implicit in
>>>     reattaching to something else...". If we have a better name in the
>>>     future, we can do it in other series.
>>>   - Adjust the patch of "iommu: Add set_platform_dma_ops iommu ops"
>>>     according to Jason's following suggestion " ... This is a bit ugly,
>>>     it would be better to make the previous patch call set_platform_dma
>>>     if it is set instead of detach_dev and then these patches should
>>> just
>>>     rename the driver's fsl_pamu_detach_device to
>>>     fsl_pamu_set_platform_dma ..."
>>>   - Add a new patch to remove deferred attach check from
>>>     __iommu_detach_domain() path. Make it a separate patch as the
>>>     prerequisite to remove __iommu_detach_device() helper.
>>>   - Rename set_platform_dma to set_platform_dma_ops to make it more
>>>     meaningful.
>>>
>>> v3:
>>>   -
>>> https://lore.kernel.org/linux-iommu/[email protected]/
>>>   - Setting blocking domain is not conceptually equal to detach_dev.
>>>     Dropped all blocking domain related staffs in the previous version.
>>>
>>> v2:
>>>   -
>>> https://lore.kernel.org/linux-iommu/[email protected]/
>>>   - Replace .detach_dev callback with static block domain ops;
>>>   - Rename .attach_dev to .set_dev.
>>>
>>> v1:
>>>   -
>>> https://lore.kernel.org/linux-iommu/[email protected]/
>>>
>>> Jason Gunthorpe (1):
>>>    iommu: Remove deferred attach check from __iommu_detach_device()
>>>
>>> Lu Baolu (4):
>>>    iommu: Remove detach_dev callbacks
>>>    iommu: Add set_platform_dma_ops iommu ops
>>>    iommu: Add set_platform_dma_ops callbacks
>>>    iommu: Remove detach_dev callback
>>>
>>>   include/linux/iommu.h                   |  8 ++-
>>>   include/trace/events/iommu.h            |  7 --
>>>   drivers/iommu/amd/iommu.c               | 26 -------
>>>   drivers/iommu/apple-dart.c              | 24 -------
>>>   drivers/iommu/arm/arm-smmu/qcom_iommu.c | 23 ------
>>>   drivers/iommu/exynos-iommu.c            |  1 -
>>>   drivers/iommu/fsl_pamu_domain.c         |  6 +-
>>>   drivers/iommu/iommu-traces.c            |  1 -
>>>   drivers/iommu/iommu.c                   | 94 ++++++++++++-------------
>>>   drivers/iommu/ipmmu-vmsa.c              | 16 -----
>>>   drivers/iommu/msm_iommu.c               |  6 +-
>>>   drivers/iommu/mtk_iommu.c               |  9 ---
>>>   drivers/iommu/mtk_iommu_v1.c            |  4 +-
>>>   drivers/iommu/omap-iommu.c              |  6 +-
>>>   drivers/iommu/rockchip-iommu.c          |  1 -
>>>   drivers/iommu/s390-iommu.c              |  7 +-
>>>   drivers/iommu/sprd-iommu.c              | 16 -----
>>>   drivers/iommu/sun50i-iommu.c            |  1 -
>>>   drivers/iommu/tegra-gart.c              |  6 +-
>>>   drivers/iommu/tegra-smmu.c              |  5 +-
>>>   20 files changed, 69 insertions(+), 198 deletions(-)
>>>
>>
>> I hit a problem with this series on a Firefly-RK3288, I bisected down to
>> 1b932ceddd19 ("iommu: Remove detach_dev callbacks"). The first splat is:
>>
>> [    7.227853] ------------[ cut here ]------------
>> [    7.227900] WARNING: CPU: 0 PID: 9 at drivers/iommu/iommu.c:2198
>> __iommu_group_set_domain+0xb4/0xc8
>> [    7.227920] Modules linked in:
>> [    7.227935] CPU: 0 PID: 9 Comm: kworker/u8:0 Not tainted 6.3.0-rc1 #1
>> [    7.227942] Hardware name: Rockchip (Device Tree)
>> [    7.227948] Workqueue: events_unbound deferred_probe_work_func
>> [    7.227964]  unwind_backtrace from show_stack+0x10/0x14
>> [    7.227978]  show_stack from dump_stack_lvl+0x58/0x70
>> [    7.227992]  dump_stack_lvl from __warn+0x7c/0x1dc
>> [    7.228008]  __warn from warn_slowpath_fmt+0xbc/0x1a0
>> [    7.228022]  warn_slowpath_fmt from __iommu_group_set_domain+0xb4/0xc8
>> [    7.228035]  __iommu_group_set_domain from
>> iommu_detach_device+0x84/0xf8
>> [    7.228046]  iommu_detach_device from
>> arm_iommu_detach_device+0x24/0xc4
>> [    7.228057]  arm_iommu_detach_device from
>> rockchip_drm_dma_attach_device+0x30/0x74
>> [    7.228073]  rockchip_drm_dma_attach_device from
>> vop_crtc_atomic_enable+0xf8/0xab0
>> [    7.228085]  vop_crtc_atomic_enable from
>> drm_atomic_helper_commit_modeset_enables+0xb0/0x2a0
>> [    7.228097]  drm_atomic_helper_commit_modeset_enables from
>> drm_atomic_helper_commit_tail_rpm+0x44/0x8c
>> [    7.228111]  drm_atomic_helper_commit_tail_rpm from
>> commit_tail+0x9c/0x184
>> [    7.228124]  commit_tail from drm_atomic_helper_commit+0x164/0x18c
>> [    7.228137]  drm_atomic_helper_commit from drm_atomic_commit+0xb0/0xe8
>> [    7.228153]  drm_atomic_commit from
>> drm_client_modeset_commit_atomic+0x240/0x288
>> [    7.228166]  drm_client_modeset_commit_atomic from
>> drm_client_modeset_commit_locked+0x60/0x1cc
>> [    7.228174]  drm_client_modeset_commit_locked from
>> drm_client_modeset_commit+0x24/0x40
>> [    7.228183]  drm_client_modeset_commit from
>> drm_fb_helper_set_par+0xb8/0xf8
>> [    7.228197]  drm_fb_helper_set_par from fbcon_init+0x2a0/0x534
>> [    7.228211]  fbcon_init from visual_init+0xc0/0x108
>> [    7.228224]  visual_init from do_bind_con_driver+0x1bc/0x3a8
>> [    7.228237]  do_bind_con_driver from do_take_over_console+0x134/0x1d4
>> [    7.228251]  do_take_over_console from do_fbcon_takeover+0x6c/0xcc
>> [    7.228264]  do_fbcon_takeover from fbcon_fb_registered+0x198/0x1a8
>> [    7.228277]  fbcon_fb_registered from register_framebuffer+0x1d0/0x268
>> [    7.228292]  register_framebuffer from
>> __drm_fb_helper_initial_config_and_unlock+0x358/0x578
>> [    7.228308]  __drm_fb_helper_initial_config_and_unlock from
>> drm_fbdev_client_hotplug+0x6c/0xa8
>> [    7.228322]  drm_fbdev_client_hotplug from
>> drm_fbdev_generic_setup+0x84/0x174
>> [    7.228335]  drm_fbdev_generic_setup from
>> rockchip_drm_bind+0x1dc/0x200
>> [    7.228349]  rockchip_drm_bind from
>> try_to_bring_up_aggregate_device+0x200/0x2d8
>> [    7.228368]  try_to_bring_up_aggregate_device from
>> component_master_add_with_match+0xc4/0xf8
>> [    7.228380]  component_master_add_with_match from
>> rockchip_drm_platform_probe+0x150/0x254
>> [    7.228392]  rockchip_drm_platform_probe from platform_probe+0x5c/0xb8
>> [    7.228405]  platform_probe from really_probe+0xe0/0x400
>> [    7.228417]  really_probe from __driver_probe_device+0x9c/0x1fc
>> [    7.228431]  __driver_probe_device from driver_probe_device+0x30/0xc0
>> [    7.228445]  driver_probe_device from
>> __device_attach_driver+0xa8/0x120
>> [    7.228460]  __device_attach_driver from bus_for_each_drv+0x84/0xdc
>> [    7.228474]  bus_for_each_drv from __device_attach+0xb0/0x20c
>> [    7.228488]  __device_attach from bus_probe_device+0x8c/0x90
>> [    7.228502]  bus_probe_device from deferred_probe_work_func+0x98/0xe0
>> [    7.228515]  deferred_probe_work_func from
>> process_one_work+0x290/0x740
>> [    7.228528]  process_one_work from worker_thread+0x54/0x518
>> [    7.228536]  worker_thread from kthread+0xf0/0x110
>> [    7.228548]  kthread from ret_from_fork+0x14/0x2c
>> [    7.228561] Exception stack(0xf084dfb0 to 0xf084dff8)
>> [    7.228567] dfa0:                                     00000000
>> 00000000 00000000 00000000
>> [    7.228573] dfc0: 00000000 00000000 00000000 00000000 00000000
>> 00000000 00000000 00000000
>> [    7.228579] dfe0: 00000000 00000000 00000000 00000000 00000013
>> 00000000
>> [    7.228585] irq event stamp: 138379
>> [    7.228592] hardirqs last  enabled at (138385): [<c03c42ac>]
>> vprintk_emit+0x330/0x354
>> [    7.228606] hardirqs last disabled at (138390): [<c03c4268>]
>> vprintk_emit+0x2ec/0x354
>> [    7.228617] softirqs last  enabled at (137258): [<c03016ac>]
>> __do_softirq+0x2f8/0x548
>> [    7.228628] softirqs last disabled at (137253): [<c0350218>]
>> __irq_exit_rcu+0x14c/0x170
>> [    7.228639] ---[ end trace 0000000000000000 ]---
>>
>> (complete log attached)
>>
>> I'm not sure how to debug this. From the callstack I can see that
>> __iommu_group_set_domain is getting a NULL new_domain which triggers
>> a call to __iommu_group_do_set_platform_dma which is WARNing because
>> there is no set_platform_dma_ops callback. The NULL new_domain is
>> because group->default_domain is NULL in __iommu_group_set_core_domain.
>>
>> So is the logic in the first patch that there is a default domain
>> incorrect for this rockchip driver? Or is this callpath just hitting
>> the function before the default domain can be created?
>>
>> Help debugging this would be appreciated - I'm not very familiar
>> with this area of code.
>
> It's the same thing Marek hit on Exynos: drivers which run on 32-bit Arm
> need a .set_platform_dma op, regardless of whether they support default
> domains on other platforms which use those.

Ah, that seems to be it. I'll post a patch.

Thanks,

Steve