2024-01-25 12:09:18

by Shivaprasad G Bhat

[permalink] [raw]
Subject: [PATCH 0/2] powerpc: iommu: Fix the vfio-pci bind and unbind issues

On PowerNV and pSeries machines, the bind and unbind of vfio-pci is
broken with the below commits.

2ad56efa80db ("powerpc/iommu: Setup a default domain and remove set_platform_dma_ops")
0f6a90436a57 ("iommu: Do not use IOMMU_DOMAIN_DMA if CONFIG_IOMMU_DMA is not enabled")

Details of the same are in the following patch descriptions.

---

Shivaprasad G Bhat (2):
powerpc: iommu: Bring back table group release_ownership() call
iommu: Fix the domain type checks when default_domain is set


arch/powerpc/kernel/iommu.c | 16 +++++++++++++---
drivers/iommu/iommu.c | 14 ++++++++++----
2 files changed, 23 insertions(+), 7 deletions(-)

--
Signature



2024-01-25 12:09:27

by Shivaprasad G Bhat

[permalink] [raw]
Subject: [PATCH 1/2] powerpc: iommu: Bring back table group release_ownership() call

The commit 2ad56efa80db ("powerpc/iommu: Setup a default domain and
remove set_platform_dma_ops") refactored the code removing the
set_platform_dma_ops(). It missed out the table group
release_ownership() call which would have got called otherwise
during the guest shutdown via vfio_group_detach_container(). On
PPC64, this particular call actually sets up the 32-bit TCE table,
and enables the 64-bit DMA bypass etc. Now after guest shutdown,
the subsequent host driver (e.g megaraid-sas) probe post unbind
from vfio-pci fails like,

megaraid_sas 0031:01:00.0: Warning: IOMMU dma not supported: mask 0x7fffffffffffffff, table unavailable
megaraid_sas 0031:01:00.0: Warning: IOMMU dma not supported: mask 0xffffffff, table unavailable
megaraid_sas 0031:01:00.0: Failed to set DMA mask
megaraid_sas 0031:01:00.0: Failed from megasas_init_fw 6539

The patch brings back the call to table_group release_ownership()
call when switching back to PLATFORM domain.

Fixes: 2ad56efa80db ("powerpc/iommu: Setup a default domain and remove set_platform_dma_ops")
Signed-off-by: Shivaprasad G Bhat <[email protected]>
---
arch/powerpc/kernel/iommu.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index ebe259bdd462..ac7df43fa7ef 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -1296,9 +1296,19 @@ spapr_tce_platform_iommu_attach_dev(struct iommu_domain *platform_domain,
if (!grp)
return -ENODEV;

- table_group = iommu_group_get_iommudata(grp);
- ret = table_group->ops->take_ownership(table_group);
- iommu_group_put(grp);
+ if (platform_domain->type == IOMMU_DOMAIN_PLATFORM) {
+ ret = 0;
+ table_group = iommu_group_get_iommudata(grp);
+ /*
+ * The domain being set to PLATFORM from earlier
+ * BLOCKED. The table_group ownership has to be released.
+ */
+ table_group->ops->release_ownership(table_group);
+ } else if (platform_domain->type == IOMMU_DOMAIN_BLOCKED) {
+ table_group = iommu_group_get_iommudata(grp);
+ ret = table_group->ops->take_ownership(table_group);
+ iommu_group_put(grp);
+ }

return ret;
}



2024-01-25 12:09:45

by Shivaprasad G Bhat

[permalink] [raw]
Subject: [PATCH 2/2] iommu: Fix the domain type checks when default_domain is set

On PPC64, the iommu_ops.def_domain_type() is not defined and
CONFIG_IOMMU_DMA not enabled. With commit 0f6a90436a57 ("iommu: Do not
use IOMMU_DOMAIN_DMA if CONFIG_IOMMU_DMA is not enabled"), the
iommu_get_default_domain_type() return IOMMU_DOMAIN_IDENTITY. With
commit 2ad56efa80db ("powerpc/iommu: Setup a default domain and remove
set_platform_dma_ops"), the defaule_domain is set wih the type being
PLATFORM. With these two changes, iommu_group_alloc_default_domain()
ends up returning the NULL(with recent changes, ERR_PTR(-EINVAL))
leading to iommu_probe_device() failure and the device has no
iommu_group set in effect. Subsequently, the bind to vfio(VFIO_IOMMU)
fail as the iommu_group is not set for the device.

Make the iommu_get_default_domain_type() to take default_domain->type
into consideration along with default_domain_type() and fix
iommu_group_alloc_default_domain() to not error out if the requested
type is same as default domain type.

Fixes: 2ad56efa80db ("powerpc/iommu: Setup a default domain and remove set_platform_dma_ops")
Fixes: 0f6a90436a57 ("iommu: Do not use IOMMU_DOMAIN_DMA if CONFIG_IOMMU_DMA is not enabled")
Signed-off-by: Shivaprasad G Bhat <[email protected]>
---
drivers/iommu/iommu.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 68e648b55767..04083a1d9f7e 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -135,6 +135,8 @@ static struct group_device *iommu_group_alloc_device(struct iommu_group *group,
struct device *dev);
static void __iommu_group_free_device(struct iommu_group *group,
struct group_device *grp_dev);
+static int iommu_get_def_domain_type(struct iommu_group *group,
+ struct device *dev, int cur_type);

#define IOMMU_GROUP_ATTR(_name, _mode, _show, _store) \
struct iommu_group_attribute iommu_group_attr_##_name = \
@@ -1788,7 +1790,8 @@ __iommu_group_alloc_default_domain(struct iommu_group *group, int req_type)
static struct iommu_domain *
iommu_group_alloc_default_domain(struct iommu_group *group, int req_type)
{
- const struct iommu_ops *ops = dev_iommu_ops(iommu_group_first_dev(group));
+ struct device *dev = iommu_group_first_dev(group);
+ const struct iommu_ops *ops = dev_iommu_ops(dev);
struct iommu_domain *dom;

lockdep_assert_held(&group->mutex);
@@ -1799,7 +1802,7 @@ iommu_group_alloc_default_domain(struct iommu_group *group, int req_type)
* domain. Do not use in new drivers.
*/
if (ops->default_domain) {
- if (req_type)
+ if (iommu_get_def_domain_type(group, dev, req_type) != req_type)
return ERR_PTR(-EINVAL);
return ops->default_domain;
}
@@ -1871,10 +1874,13 @@ static int iommu_get_def_domain_type(struct iommu_group *group,
const struct iommu_ops *ops = dev_iommu_ops(dev);
int type;

- if (!ops->def_domain_type)
+ if (ops->def_domain_type)
+ type = ops->def_domain_type(dev);
+ else if (group->default_domain)
+ type = group->default_domain->type;
+ else
return cur_type;

- type = ops->def_domain_type(dev);
if (!type || cur_type == type)
return cur_type;
if (!cur_type)



2024-01-25 15:53:10

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 2/2] iommu: Fix the domain type checks when default_domain is set

On Thu, Jan 25, 2024 at 06:08:52AM -0600, Shivaprasad G Bhat wrote:
> On PPC64, the iommu_ops.def_domain_type() is not defined and
> CONFIG_IOMMU_DMA not enabled. With commit 0f6a90436a57 ("iommu: Do not
> use IOMMU_DOMAIN_DMA if CONFIG_IOMMU_DMA is not enabled"), the
> iommu_get_default_domain_type() return IOMMU_DOMAIN_IDENTITY. With
> commit 2ad56efa80db ("powerpc/iommu: Setup a default domain and remove
> set_platform_dma_ops"), the defaule_domain is set wih the type being
> PLATFORM. With these two changes, iommu_group_alloc_default_domain()
> ends up returning the NULL(with recent changes, ERR_PTR(-EINVAL))
> leading to iommu_probe_device() failure and the device has no
> iommu_group set in effect. Subsequently, the bind to vfio(VFIO_IOMMU)
> fail as the iommu_group is not set for the device.
>
> Make the iommu_get_default_domain_type() to take default_domain->type
> into consideration along with default_domain_type() and fix
> iommu_group_alloc_default_domain() to not error out if the requested
> type is same as default domain type.
>
> Fixes: 2ad56efa80db ("powerpc/iommu: Setup a default domain and remove set_platform_dma_ops")
> Fixes: 0f6a90436a57 ("iommu: Do not use IOMMU_DOMAIN_DMA if CONFIG_IOMMU_DMA is not enabled")
> Signed-off-by: Shivaprasad G Bhat <[email protected]>
> ---
> drivers/iommu/iommu.c | 14 ++++++++++----
> 1 file changed, 10 insertions(+), 4 deletions(-)

Are you OK with this version?

https://lore.kernel.org/linux-iommu/[email protected]/

?

I think it does the same thing

Jason

2024-01-25 15:56:44

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 1/2] powerpc: iommu: Bring back table group release_ownership() call

On Thu, Jan 25, 2024 at 06:08:39AM -0600, Shivaprasad G Bhat wrote:
> The commit 2ad56efa80db ("powerpc/iommu: Setup a default domain and
> remove set_platform_dma_ops") refactored the code removing the
> set_platform_dma_ops(). It missed out the table group
> release_ownership() call which would have got called otherwise
> during the guest shutdown via vfio_group_detach_container(). On
> PPC64, this particular call actually sets up the 32-bit TCE table,
> and enables the 64-bit DMA bypass etc. Now after guest shutdown,
> the subsequent host driver (e.g megaraid-sas) probe post unbind
> from vfio-pci fails like,
>
> megaraid_sas 0031:01:00.0: Warning: IOMMU dma not supported: mask 0x7fffffffffffffff, table unavailable
> megaraid_sas 0031:01:00.0: Warning: IOMMU dma not supported: mask 0xffffffff, table unavailable
> megaraid_sas 0031:01:00.0: Failed to set DMA mask
> megaraid_sas 0031:01:00.0: Failed from megasas_init_fw 6539
>
> The patch brings back the call to table_group release_ownership()
> call when switching back to PLATFORM domain.
>
> Fixes: 2ad56efa80db ("powerpc/iommu: Setup a default domain and remove set_platform_dma_ops")
> Signed-off-by: Shivaprasad G Bhat <[email protected]>
> ---
> arch/powerpc/kernel/iommu.c | 16 +++++++++++++---
> 1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
> index ebe259bdd462..ac7df43fa7ef 100644
> --- a/arch/powerpc/kernel/iommu.c
> +++ b/arch/powerpc/kernel/iommu.c
> @@ -1296,9 +1296,19 @@ spapr_tce_platform_iommu_attach_dev(struct iommu_domain *platform_domain,
> if (!grp)
> return -ENODEV;
>
> - table_group = iommu_group_get_iommudata(grp);
> - ret = table_group->ops->take_ownership(table_group);
> - iommu_group_put(grp);
> + if (platform_domain->type == IOMMU_DOMAIN_PLATFORM) {
> + ret = 0;
> + table_group = iommu_group_get_iommudata(grp);
> + /*
> + * The domain being set to PLATFORM from earlier
> + * BLOCKED. The table_group ownership has to be released.
> + */
> + table_group->ops->release_ownership(table_group);
> + } else if (platform_domain->type == IOMMU_DOMAIN_BLOCKED) {
> + table_group = iommu_group_get_iommudata(grp);
> + ret = table_group->ops->take_ownership(table_group);
> + iommu_group_put(grp);
> + }

Sure, but please split the function, don't test on the
platform->domain_type.

Also, is there any chance someone can work on actually fixing this to
be a proper iommu driver? I think that will become important for power
to use the common dma_iommu code in the next year...

Sort of like this:

diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index ebe259bdd46298..0d6a7fea2bd9a5 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -1287,20 +1287,20 @@ spapr_tce_platform_iommu_attach_dev(struct iommu_domain *platform_domain,
struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct iommu_group *grp = iommu_group_get(dev);
struct iommu_table_group *table_group;
- int ret = -EINVAL;

/* At first attach the ownership is already set */
if (!domain)
return 0;

- if (!grp)
- return -ENODEV;
-
table_group = iommu_group_get_iommudata(grp);
- ret = table_group->ops->take_ownership(table_group);
+ /*
+ * The domain being set to PLATFORM from earlier
+ * BLOCKED. The table_group ownership has to be released.
+ */
+ table_group->ops->release_ownership(table_group);
iommu_group_put(grp);

- return ret;
+ return 0
}

static const struct iommu_domain_ops spapr_tce_platform_domain_ops = {
@@ -1312,13 +1312,33 @@ static struct iommu_domain spapr_tce_platform_domain = {
.ops = &spapr_tce_platform_domain_ops,
};

-static struct iommu_domain spapr_tce_blocked_domain = {
- .type = IOMMU_DOMAIN_BLOCKED,
+static int
+spapr_tce_platform_iommu_blocked_dev(struct iommu_domain *platform_domain,
+ struct device *dev)
+{
+ struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+ struct iommu_group *grp = iommu_group_get(dev);
+ struct iommu_table_group *table_group;
+ int ret = -EINVAL;
+
/*
* FIXME: SPAPR mixes blocked and platform behaviors, the blocked domain
* also sets the dma_api ops
*/
- .ops = &spapr_tce_platform_domain_ops,
+ table_group = iommu_group_get_iommudata(grp);
+ ret = table_group->ops->take_ownership(table_group);
+ iommu_group_put(grp);
+
+ return ret;
+}
+
+static const struct iommu_domain_ops spapr_tce_blocked_domain_ops = {
+ .attach_dev = spapr_tce_blocked_iommu_attach_dev,
+};
+
+static struct iommu_domain spapr_tce_blocked_domain = {
+ .type = IOMMU_DOMAIN_BLOCKED,
+ .ops = &spapr_tce_blocked_domain_ops,
};

static bool spapr_tce_iommu_capable(struct device *dev, enum iommu_cap cap)

2024-01-26 15:16:37

by Shivaprasad G Bhat

[permalink] [raw]
Subject: Re: [PATCH 1/2] powerpc: iommu: Bring back table group release_ownership() call


On 1/25/24 21:20, Jason Gunthorpe wrote:
> On Thu, Jan 25, 2024 at 06:08:39AM -0600, Shivaprasad G Bhat wrote:
>> The commit 2ad56efa80db ("powerpc/iommu: Setup a default domain and
[snip]
>> + /*
>> + * The domain being set to PLATFORM from earlier
>> + * BLOCKED. The table_group ownership has to be released.
>> + */
>> + table_group->ops->release_ownership(table_group);
>> + } else if (platform_domain->type == IOMMU_DOMAIN_BLOCKED) {
>> + table_group = iommu_group_get_iommudata(grp);
>> + ret = table_group->ops->take_ownership(table_group);
>> + iommu_group_put(grp);
>> + }
> Sure, but please split the function, don't test on the
> platform->domain_type.
Sure.
> Also, is there any chance someone can work on actually fixing this to
> be a proper iommu driver? I think that will become important for power
> to use the common dma_iommu code in the next year...
We are looking into it.
> Sort of like this:
>
> diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
> index ebe259bdd46298..0d6a7fea2bd9a5 100644
> --- a/arch/powerpc/kernel/iommu.c
> +++ b/arch/powerpc/kernel/iommu.c
> @@ -1287,20 +1287,20 @@ spapr_tce_platform_iommu_attach_dev(struct iommu_domain *platform_domain,
> struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> struct iommu_group *grp = iommu_group_get(dev);
[snip]
> +static const struct iommu_domain_ops spapr_tce_blocked_domain_ops = {
> + .attach_dev = spapr_tce_blocked_iommu_attach_dev,
> +};
> +
> +static struct iommu_domain spapr_tce_blocked_domain = {
> + .type = IOMMU_DOMAIN_BLOCKED,
> + .ops = &spapr_tce_blocked_domain_ops,
> };
>
> static bool spapr_tce_iommu_capable(struct device *dev, enum iommu_cap cap)

I have posted the next version as suggested.

Thanks for the quick review!

Regards,

Shivaprasad



2024-01-26 15:17:14

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 1/2] powerpc: iommu: Bring back table group release_ownership() call

On Fri, Jan 26, 2024 at 08:43:12PM +0530, Shivaprasad G Bhat wrote:
> > Also, is there any chance someone can work on actually fixing this to
> > be a proper iommu driver? I think that will become important for power
> > to use the common dma_iommu code in the next year...
> We are looking into it.

Okay, let me know, I can possibly help make parts of this happen

power is the last still-current architecture to be outside the modern
IOMMU and DMA API design and I'm going to start proposing things that
will not be efficient on power because of this.

I think a basic iommu driver using the dma API would not be so hard.

I don't know what to do about the SPAPR VFIO mess though. :(

Jason

2024-01-26 15:20:27

by Shivaprasad G Bhat

[permalink] [raw]
Subject: Re: [PATCH 2/2] iommu: Fix the domain type checks when default_domain is set

On 1/25/24 21:22, Jason Gunthorpe wrote:
> On Thu, Jan 25, 2024 at 06:08:52AM -0600, Shivaprasad G Bhat wrote:
>> On PPC64, the iommu_ops.def_domain_type() is not defined and
>> CONFIG_IOMMU_DMA not enabled. With commit 0f6a90436a57 ("iommu: Do not
>> use IOMMU_DOMAIN_DMA if CONFIG_IOMMU_DMA is not enabled"), the
>> iommu_get_default_domain_type() return IOMMU_DOMAIN_IDENTITY. With
>> commit 2ad56efa80db ("powerpc/iommu: Setup a default domain and remove
>> set_platform_dma_ops"), the defaule_domain is set wih the type being
>> PLATFORM. With these two changes, iommu_group_alloc_default_domain()
>> ends up returning the NULL(with recent changes, ERR_PTR(-EINVAL))
>> leading to iommu_probe_device() failure and the device has no
>> iommu_group set in effect. Subsequently, the bind to vfio(VFIO_IOMMU)
>> fail as the iommu_group is not set for the device.
>>
>> Make the iommu_get_default_domain_type() to take default_domain->type
>> into consideration along with default_domain_type() and fix
>> iommu_group_alloc_default_domain() to not error out if the requested
>> type is same as default domain type.
>>
>> Fixes: 2ad56efa80db ("powerpc/iommu: Setup a default domain and remove set_platform_dma_ops")
>> Fixes: 0f6a90436a57 ("iommu: Do not use IOMMU_DOMAIN_DMA if CONFIG_IOMMU_DMA is not enabled")
>> Signed-off-by: Shivaprasad G Bhat <[email protected]>
>> ---
>> drivers/iommu/iommu.c | 14 ++++++++++----
>> 1 file changed, 10 insertions(+), 4 deletions(-)
> Are you OK with this version?
>
> https://lore.kernel.org/linux-iommu/[email protected]/
>
> ?
>
> I think it does the same thing

Yes, This works. I see a very minor difference of default_domain->type
is given

precedence over def_domain_type(). Please keep me CC when you post this
fix, I'll

test it with any(?) other changes if coming along with it.


Thanks,

Shivaprasad

> Jason

2024-01-26 15:37:00

by Timothy Pearson

[permalink] [raw]
Subject: Re: [PATCH 1/2] powerpc: iommu: Bring back table group release_ownership() call



----- Original Message -----
> From: "Jason Gunthorpe" <[email protected]>
> To: "Shivaprasad G Bhat" <[email protected]>
> Cc: [email protected], "linuxppc-dev" <[email protected]>, "linux-kernel"
> <[email protected]>, "Michael Ellerman" <[email protected]>, "npiggin" <[email protected]>, "christophe
> leroy" <[email protected]>, "aneesh kumar" <[email protected]>, "naveen n rao"
> <[email protected]>, [email protected], "Timothy Pearson" <[email protected]>, [email protected],
> [email protected], "Greg Kroah-Hartman" <[email protected]>, [email protected],
> [email protected]
> Sent: Friday, January 26, 2024 9:17:01 AM
> Subject: Re: [PATCH 1/2] powerpc: iommu: Bring back table group release_ownership() call

> On Fri, Jan 26, 2024 at 08:43:12PM +0530, Shivaprasad G Bhat wrote:
>> > Also, is there any chance someone can work on actually fixing this to
>> > be a proper iommu driver? I think that will become important for power
>> > to use the common dma_iommu code in the next year...
>> We are looking into it.
>
> Okay, let me know, I can possibly help make parts of this happen
>
> power is the last still-current architecture to be outside the modern
> IOMMU and DMA API design and I'm going to start proposing things that
> will not be efficient on power because of this.

I can get development resources on this fairly rapidly, including testing. We should figure out the best way forward and how to deal with the VFIO side of things, even if that's a rewrite at the end of the day the machine-specific codebase isn't *that* large for our two target flavors (64-bit PowerNV and 64-bit pSeries).

> I think a basic iommu driver using the dma API would not be so hard.
>
> I don't know what to do about the SPAPR VFIO mess though. :(
>
> Jason

2024-01-26 15:38:16

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 1/2] powerpc: iommu: Bring back table group release_ownership() call

On Fri, Jan 26, 2024 at 09:29:55AM -0600, Timothy Pearson wrote:
> > On Fri, Jan 26, 2024 at 08:43:12PM +0530, Shivaprasad G Bhat wrote:
> >> > Also, is there any chance someone can work on actually fixing this to
> >> > be a proper iommu driver? I think that will become important for power
> >> > to use the common dma_iommu code in the next year...
> >> We are looking into it.
> >
> > Okay, let me know, I can possibly help make parts of this happen
> >
> > power is the last still-current architecture to be outside the modern
> > IOMMU and DMA API design and I'm going to start proposing things that
> > will not be efficient on power because of this.
>
> I can get development resources on this fairly rapidly, including
> testing. We should figure out the best way forward and how to deal
> with the VFIO side of things, even if that's a rewrite at the end of
> the day the machine-specific codebase isn't *that* large for our two
> target flavors (64-bit PowerNV and 64-bit pSeries).

I have a feeling the way forward is to just start a power driver under
drivers/iommu/ and use kconfig to make the user exclusively select
either the legacy arch or the modern iommu.

Get that working to a level where dma_iommu is happy.

Get iommufd support in the new driver good enough to run your
application.

Just forget about the weird KVM and SPAPR stuff, leave it under the
kconfig of the old code and nobody will run it. Almost nobody already
runs it, apparently.

Remove it in a few years

From what I remember the code under the arch directory is already very
nearly almost an iommu driver. I think someone could fairly quickly
get to something working and using dma_iommu.c. If s390 is any
experience there is some benchmarking and tweaking to get performance
equal to the arch's tweaked dma_iommu copy.

Jason

2024-01-26 15:40:10

by Timothy Pearson

[permalink] [raw]
Subject: Re: [PATCH 1/2] powerpc: iommu: Bring back table group release_ownership() call



----- Original Message -----
> From: "Jason Gunthorpe" <[email protected]>
> To: "Timothy Pearson" <[email protected]>
> Cc: "Shivaprasad G Bhat" <[email protected]>, "iommu" <[email protected]>, "linuxppc-dev"
> <[email protected]>, "linux-kernel" <[email protected]>, "Michael Ellerman"
> <[email protected]>, "npiggin" <[email protected]>, "christophe leroy" <[email protected]>, "aneesh kumar"
> <[email protected]>, "naveen n rao" <[email protected]>, "jroedel" <[email protected]>, "aik"
> <[email protected]>, "bgray" <[email protected]>, "Greg Kroah-Hartman" <[email protected]>, "gbatra"
> <[email protected]>, "vaibhav" <[email protected]>
> Sent: Friday, January 26, 2024 9:38:06 AM
> Subject: Re: [PATCH 1/2] powerpc: iommu: Bring back table group release_ownership() call

> On Fri, Jan 26, 2024 at 09:29:55AM -0600, Timothy Pearson wrote:
>> > On Fri, Jan 26, 2024 at 08:43:12PM +0530, Shivaprasad G Bhat wrote:
>> >> > Also, is there any chance someone can work on actually fixing this to
>> >> > be a proper iommu driver? I think that will become important for power
>> >> > to use the common dma_iommu code in the next year...
>> >> We are looking into it.
>> >
>> > Okay, let me know, I can possibly help make parts of this happen
>> >
>> > power is the last still-current architecture to be outside the modern
>> > IOMMU and DMA API design and I'm going to start proposing things that
>> > will not be efficient on power because of this.
>>
>> I can get development resources on this fairly rapidly, including
>> testing. We should figure out the best way forward and how to deal
>> with the VFIO side of things, even if that's a rewrite at the end of
>> the day the machine-specific codebase isn't *that* large for our two
>> target flavors (64-bit PowerNV and 64-bit pSeries).
>
> I have a feeling the way forward is to just start a power driver under
> drivers/iommu/ and use kconfig to make the user exclusively select
> either the legacy arch or the modern iommu.
>
> Get that working to a level where dma_iommu is happy.
>
> Get iommufd support in the new driver good enough to run your
> application.
>
> Just forget about the weird KVM and SPAPR stuff, leave it under the
> kconfig of the old code and nobody will run it. Almost nobody already
> runs it, apparently.

We actually use QEMU/KVM/VFIO extensively at Raptor, so need the support and need it to be performant...

> Remove it in a few years
>
> From what I remember the code under the arch directory is already very
> nearly almost an iommu driver. I think someone could fairly quickly
> get to something working and using dma_iommu.c. If s390 is any
> experience there is some benchmarking and tweaking to get performance
> equal to the arch's tweaked dma_iommu copy.
>
> Jason

2024-01-26 15:45:06

by Timothy Pearson

[permalink] [raw]
Subject: Re: [PATCH 1/2] powerpc: iommu: Bring back table group release_ownership() call



----- Original Message -----
> From: "Timothy Pearson" <[email protected]>
> To: "Jason Gunthorpe" <[email protected]>
> Cc: "Timothy Pearson" <[email protected]>, "Shivaprasad G Bhat" <[email protected]>, "iommu"
> <[email protected]>, "linuxppc-dev" <[email protected]>, "linux-kernel" <[email protected]>,
> "Michael Ellerman" <[email protected]>, "npiggin" <[email protected]>, "christophe leroy"
> <[email protected]>, "aneesh kumar" <[email protected]>, "naveen n rao" <[email protected]>,
> "jroedel" <[email protected]>, "aik" <[email protected]>, "bgray" <[email protected]>, "Greg Kroah-Hartman"
> <[email protected]>, "gbatra" <[email protected]>, "vaibhav" <[email protected]>
> Sent: Friday, January 26, 2024 9:39:56 AM
> Subject: Re: [PATCH 1/2] powerpc: iommu: Bring back table group release_ownership() call

> ----- Original Message -----
>> From: "Jason Gunthorpe" <[email protected]>
>> To: "Timothy Pearson" <[email protected]>
>> Cc: "Shivaprasad G Bhat" <[email protected]>, "iommu" <[email protected]>,
>> "linuxppc-dev"
>> <[email protected]>, "linux-kernel" <[email protected]>,
>> "Michael Ellerman"
>> <[email protected]>, "npiggin" <[email protected]>, "christophe leroy"
>> <[email protected]>, "aneesh kumar"
>> <[email protected]>, "naveen n rao" <[email protected]>,
>> "jroedel" <[email protected]>, "aik"
>> <[email protected]>, "bgray" <[email protected]>, "Greg Kroah-Hartman"
>> <[email protected]>, "gbatra"
>> <[email protected]>, "vaibhav" <[email protected]>
>> Sent: Friday, January 26, 2024 9:38:06 AM
>> Subject: Re: [PATCH 1/2] powerpc: iommu: Bring back table group
>> release_ownership() call
>
>> On Fri, Jan 26, 2024 at 09:29:55AM -0600, Timothy Pearson wrote:
>>> > On Fri, Jan 26, 2024 at 08:43:12PM +0530, Shivaprasad G Bhat wrote:
>>> >> > Also, is there any chance someone can work on actually fixing this to
>>> >> > be a proper iommu driver? I think that will become important for power
>>> >> > to use the common dma_iommu code in the next year...
>>> >> We are looking into it.
>>> >
>>> > Okay, let me know, I can possibly help make parts of this happen
>>> >
>>> > power is the last still-current architecture to be outside the modern
>>> > IOMMU and DMA API design and I'm going to start proposing things that
>>> > will not be efficient on power because of this.
>>>
>>> I can get development resources on this fairly rapidly, including
>>> testing. We should figure out the best way forward and how to deal
>>> with the VFIO side of things, even if that's a rewrite at the end of
>>> the day the machine-specific codebase isn't *that* large for our two
>>> target flavors (64-bit PowerNV and 64-bit pSeries).
>>
>> I have a feeling the way forward is to just start a power driver under
>> drivers/iommu/ and use kconfig to make the user exclusively select
>> either the legacy arch or the modern iommu.
>>
>> Get that working to a level where dma_iommu is happy.
>>
>> Get iommufd support in the new driver good enough to run your
>> application.
>>
>> Just forget about the weird KVM and SPAPR stuff, leave it under the
>> kconfig of the old code and nobody will run it. Almost nobody already
>> runs it, apparently.
>
> We actually use QEMU/KVM/VFIO extensively at Raptor, so need the support and
> need it to be performant...

Never mind, I can't read this morning. :) You did say iommufd support, which gives the VFIO passthrough functionality. I think this is a reasonable approach, and will discuss further internally this afternoon.

2024-01-26 15:45:19

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 1/2] powerpc: iommu: Bring back table group release_ownership() call

On Fri, Jan 26, 2024 at 09:39:56AM -0600, Timothy Pearson wrote:
> > Just forget about the weird KVM and SPAPR stuff, leave it under the
> > kconfig of the old code and nobody will run it. Almost nobody already
> > runs it, apparently.
>
> We actually use QEMU/KVM/VFIO extensively at Raptor, so need the
> support and need it to be performant...

I wonder if you alone are the "almost" :)

The KVM entanglement was hairy and scary. I never did figure out what
was really going on there. Maybe you don't need all of it and can be
successful with a more typical iommu working model?

Suggest to tackle it after getting the first parts done.

Jason