2024-01-13 18:17:30

by Eric Badger

[permalink] [raw]
Subject: [PATCH] iommu/vt-d: Check for non-NULL domain on device release

In the kdump kernel, the IOMMU will operate in deferred_attach mode. In
this mode, info->domain may not yet be assigned by the time the
release_device function is called, which leads to the following crash in
the crashkernel:

BUG: kernel NULL pointer dereference, address: 000000000000003c
...
RIP: 0010:do_raw_spin_lock+0xa/0xa0
...
_raw_spin_lock_irqsave+0x1b/0x30
intel_iommu_release_device+0x96/0x170
iommu_deinit_device+0x39/0xf0
__iommu_group_remove_device+0xa0/0xd0
iommu_bus_notifier+0x55/0xb0
notifier_call_chain+0x5a/0xd0
blocking_notifier_call_chain+0x41/0x60
bus_notify+0x34/0x50
device_del+0x269/0x3d0
pci_remove_bus_device+0x77/0x100
p2sb_bar+0xae/0x1d0
...
i801_probe+0x423/0x740

Signed-off-by: Eric Badger <[email protected]>
---
drivers/iommu/intel/iommu.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 6fb5f6fceea1..26e450259889 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -3750,7 +3750,6 @@ static void domain_context_clear(struct device_domain_info *info)
static void dmar_remove_one_dev_info(struct device *dev)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
- struct dmar_domain *domain = info->domain;
struct intel_iommu *iommu = info->iommu;
unsigned long flags;

@@ -3763,11 +3762,14 @@ static void dmar_remove_one_dev_info(struct device *dev)
domain_context_clear(info);
}

- spin_lock_irqsave(&domain->lock, flags);
+ if (!info->domain)
+ return;
+
+ spin_lock_irqsave(&info->domain->lock, flags);
list_del(&info->link);
- spin_unlock_irqrestore(&domain->lock, flags);
+ spin_unlock_irqrestore(&info->domain->lock, flags);

- domain_detach_iommu(domain, iommu);
+ domain_detach_iommu(info->domain, iommu);
info->domain = NULL;
}

--
2.34.1



2024-01-16 15:22:25

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH] iommu/vt-d: Check for non-NULL domain on device release

On Sat, Jan 13, 2024 at 10:17:13AM -0800, Eric Badger wrote:
> In the kdump kernel, the IOMMU will operate in deferred_attach mode. In
> this mode, info->domain may not yet be assigned by the time the
> release_device function is called, which leads to the following crash in
> the crashkernel:

This never worked right? But why are you getting to release in a crash
kernel in the first place, that seems kinda weird..

> BUG: kernel NULL pointer dereference, address: 000000000000003c
> ...
> RIP: 0010:do_raw_spin_lock+0xa/0xa0
> ...
> _raw_spin_lock_irqsave+0x1b/0x30
> intel_iommu_release_device+0x96/0x170
> iommu_deinit_device+0x39/0xf0
> __iommu_group_remove_device+0xa0/0xd0
> iommu_bus_notifier+0x55/0xb0
> notifier_call_chain+0x5a/0xd0
> blocking_notifier_call_chain+0x41/0x60
> bus_notify+0x34/0x50
> device_del+0x269/0x3d0
> pci_remove_bus_device+0x77/0x100
> p2sb_bar+0xae/0x1d0
> ...
> i801_probe+0x423/0x740
>
> Signed-off-by: Eric Badger <[email protected]>

It should have a fixes line, but I'm not sure what it is..

> ---
> drivers/iommu/intel/iommu.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)

Unfortunately this issue is likely systemic in all the drivers.

Something I've been thinking about for a while now is to have the
option for the core code release to set the driver to a specific
releasing domain (probably a blocking or identity global static)

A lot of drivers are open coding this internally in their release
functions, like Intel. But it really doesn't deserve to be special.

Obviously if we structure things like that then this problem goes away
too. It looks to me like domain_context_clear_one() is doing BLOCKED
to me..

> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 6fb5f6fceea1..26e450259889 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -3750,7 +3750,6 @@ static void domain_context_clear(struct device_domain_info *info)
> static void dmar_remove_one_dev_info(struct device *dev)
> {
> struct device_domain_info *info = dev_iommu_priv_get(dev);
> - struct dmar_domain *domain = info->domain;
> struct intel_iommu *iommu = info->iommu;
> unsigned long flags;
>
> @@ -3763,11 +3762,14 @@ static void dmar_remove_one_dev_info(struct device *dev)
> domain_context_clear(info);
> }
>
> - spin_lock_irqsave(&domain->lock, flags);
> + if (!info->domain)
> + return;
> +
> + spin_lock_irqsave(&info->domain->lock, flags);
> list_del(&info->link);
> - spin_unlock_irqrestore(&domain->lock, flags);
> + spin_unlock_irqrestore(&info->domain->lock, flags);
>
> - domain_detach_iommu(domain, iommu);
> + domain_detach_iommu(info->domain, iommu);
> info->domain = NULL;
> }

This function is almost a copy of device_block_translation(), with
some bugs added :\

Lu can they be made the same? It seems to me BLOCKED could always
clear the domain context, even in scalable mode?

Anyhow, my suggestion is more like:

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 15699fe5418e3d..6c683fd4bc05ec 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -3747,30 +3747,6 @@ static void domain_context_clear(struct device_domain_info *info)
&domain_context_clear_one_cb, info);
}

-static void dmar_remove_one_dev_info(struct device *dev)
-{
- struct device_domain_info *info = dev_iommu_priv_get(dev);
- struct dmar_domain *domain = info->domain;
- struct intel_iommu *iommu = info->iommu;
- unsigned long flags;
-
- if (!dev_is_real_dma_subdevice(info->dev)) {
- if (dev_is_pci(info->dev) && sm_supported(iommu))
- intel_pasid_tear_down_entry(iommu, info->dev,
- IOMMU_NO_PASID, false);
-
- iommu_disable_pci_caps(info);
- domain_context_clear(info);
- }
-
- spin_lock_irqsave(&domain->lock, flags);
- list_del(&info->link);
- spin_unlock_irqrestore(&domain->lock, flags);
-
- domain_detach_iommu(domain, iommu);
- info->domain = NULL;
-}
-
/*
* Clear the page table pointer in context or pasid table entries so that
* all DMA requests without PASID from the device are blocked. If the page
@@ -4284,7 +4260,6 @@ static void intel_iommu_release_device(struct device *dev)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);

- dmar_remove_one_dev_info(dev);
intel_pasid_free_table(dev);
intel_iommu_debugfs_remove_dev(info);
kfree(info);
@@ -4743,6 +4718,7 @@ static const struct iommu_dirty_ops intel_dirty_ops = {

const struct iommu_ops intel_iommu_ops = {
.blocked_domain = &blocking_domain,
+ .release_domain = &blocking_domain,
.capable = intel_iommu_capable,
.hw_info = intel_iommu_hw_info,
.domain_alloc = intel_iommu_domain_alloc,
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 068f6508e57c9b..9fbbd83a82d4e9 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -442,6 +442,8 @@ static int iommu_init_device(struct device *dev, const struct iommu_ops *ops)
err_unlink:
iommu_device_unlink(iommu_dev, dev);
err_release:
+ if (ops->release_domain)
+ ops->release_domain->ops->attach_dev(ops->release_domain, dev);
if (ops->release_device)
ops->release_device(dev);
err_module_put:
@@ -461,6 +463,15 @@ static void iommu_deinit_device(struct device *dev)

iommu_device_unlink(dev->iommu->iommu_dev, dev);

+ /*
+ * If the iommu driver provides release_domain then the core code
+ * ensures that domain is attached prior to calling release_device.
+ * Drivers can use this to enforce a translation on the idle iommu.
+ * Usually the global static blocked_domain is a good choice.
+ */
+ if (ops->release_domain)
+ ops->release_domain->ops->attach_dev(ops->release_domain, dev);
+
/*
* release_device() must stop using any attached domain on the device.
* If there are still other devices in the group they are not effected
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index c34d0ef1703d68..1e8be165c878d8 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -511,6 +511,7 @@ struct iommu_ops {
struct module *owner;
struct iommu_domain *identity_domain;
struct iommu_domain *blocked_domain;
+ struct iommu_domain *release_domain;
struct iommu_domain *default_domain;
};


2024-01-17 01:59:30

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH] iommu/vt-d: Check for non-NULL domain on device release

On 2024-01-16 3:22 pm, Jason Gunthorpe wrote:
> On Sat, Jan 13, 2024 at 10:17:13AM -0800, Eric Badger wrote:
>> In the kdump kernel, the IOMMU will operate in deferred_attach mode. In
>> this mode, info->domain may not yet be assigned by the time the
>> release_device function is called, which leads to the following crash in
>> the crashkernel:
>
> This never worked right? But why are you getting to release in a crash
> kernel in the first place, that seems kinda weird..
>
>> BUG: kernel NULL pointer dereference, address: 000000000000003c
>> ...
>> RIP: 0010:do_raw_spin_lock+0xa/0xa0
>> ...
>> _raw_spin_lock_irqsave+0x1b/0x30
>> intel_iommu_release_device+0x96/0x170
>> iommu_deinit_device+0x39/0xf0
>> __iommu_group_remove_device+0xa0/0xd0
>> iommu_bus_notifier+0x55/0xb0
>> notifier_call_chain+0x5a/0xd0
>> blocking_notifier_call_chain+0x41/0x60
>> bus_notify+0x34/0x50
>> device_del+0x269/0x3d0
>> pci_remove_bus_device+0x77/0x100
>> p2sb_bar+0xae/0x1d0
>> ...
>> i801_probe+0x423/0x740
>>
>> Signed-off-by: Eric Badger <[email protected]>
>
> It should have a fixes line, but I'm not sure what it is..
>
>> ---
>> drivers/iommu/intel/iommu.c | 10 ++++++----
>> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> Unfortunately this issue is likely systemic in all the drivers.

All two of the drivers which support deferred attach, that is.

> Something I've been thinking about for a while now is to have the
> option for the core code release to set the driver to a specific
> releasing domain (probably a blocking or identity global static)
>
> A lot of drivers are open coding this internally in their release
> functions, like Intel. But it really doesn't deserve to be special.

Either way it shouldn't apply in this case, though. Crash kernels *are*
special. The whole point of deferred attach is that we don't disturb
anything unless we've got as far as definitely using a given default
domain (from which we can infer the relevant client device should have
been reset and quiesced). Thus regardless of why release might get
called, if a deferred attach never happened then the release should
still avoid touching any hardware configuration either.

I'd suggest using dev->iommu->attach_deferred as the condition rather
than a NULL domain, to be clear that it's the *only* valid reason to not
have a domain at this point.

Thanks,
Robin.

2024-01-17 02:22:46

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH] iommu/vt-d: Check for non-NULL domain on device release

On Wed, Jan 17, 2024 at 01:57:34AM +0000, Robin Murphy wrote:
> > > ---
> > > drivers/iommu/intel/iommu.c | 10 ++++++----
> > > 1 file changed, 6 insertions(+), 4 deletions(-)
> >
> > Unfortunately this issue is likely systemic in all the drivers.
>
> All two of the drivers which support deferred attach, that is.

We could call release_device on an unlikely error unwind path with a
NULL domain for all drivers.

> > Something I've been thinking about for a while now is to have the
> > option for the core code release to set the driver to a specific
> > releasing domain (probably a blocking or identity global static)
> >
> > A lot of drivers are open coding this internally in their release
> > functions, like Intel. But it really doesn't deserve to be special.
>
> Either way it shouldn't apply in this case, though. Crash kernels *are*
> special. The whole point of deferred attach is that we don't disturb
> anything unless we've got as far as definitely using a given default domain
> (from which we can infer the relevant client device should have been reset
> and quiesced).

If only it were so simple.. It assumes drivers are structured to reset
their devices using only MMIO before doing anything to setup DMA and
trigger an iommu attach. A lot of drivers aren't and this whole scheme
turns a bit sketchy.

Some devices can't even do it at all. eg mlx5 has the crashing kernel
quiet the device before passing over to the crash kernel because there
is no way beyond FLR to regain control of a mlx5 device.

> Thus regardless of why release might get called, if a
> deferred attach never happened then the release should still avoid touching
> any hardware configuration either.

Interesting point.. Makes sense to me

> I'd suggest using dev->iommu->attach_deferred as the condition rather than a
> NULL domain, to be clear that it's the *only* valid reason to not have a
> domain at this point.

So if we take this release_domain approach and we have the core code
not call it for the defered attach case then the driver's
release_device should just free memory allocated by probe and not
touch the data structures?

Then the idea is that drivers would only manipulate their STE/DTE/etc
under attach and never in probe/release.

Jason

2024-01-17 03:16:03

by Eric Badger

[permalink] [raw]
Subject: Re: [PATCH] iommu/vt-d: Check for non-NULL domain on device release

On Tue, Jan 16, 2024 at 11:22:15AM -0400, Jason Gunthorpe wrote:
> On Sat, Jan 13, 2024 at 10:17:13AM -0800, Eric Badger wrote:
> > In the kdump kernel, the IOMMU will operate in deferred_attach mode. In
> > this mode, info->domain may not yet be assigned by the time the
> > release_device function is called, which leads to the following crash in
> > the crashkernel:
>
> This never worked right? But why are you getting to release in a crash
> kernel in the first place, that seems kinda weird..

I first encountered this crash as part of an upgrade from 5.15.125 to
6.6.0. I'd guess this has not worked since p2sb_bar() was plumbed into
i801_probe(). Prior to that, there was similar code in i801_probe(), but
no call to pci_stop_and_remove_bus_device().

>
> > BUG: kernel NULL pointer dereference, address: 000000000000003c
> > ...
> > RIP: 0010:do_raw_spin_lock+0xa/0xa0
> > ...
> > _raw_spin_lock_irqsave+0x1b/0x30
> > intel_iommu_release_device+0x96/0x170
> > iommu_deinit_device+0x39/0xf0
> > __iommu_group_remove_device+0xa0/0xd0
> > iommu_bus_notifier+0x55/0xb0
> > notifier_call_chain+0x5a/0xd0
> > blocking_notifier_call_chain+0x41/0x60
> > bus_notify+0x34/0x50
> > device_del+0x269/0x3d0
> > pci_remove_bus_device+0x77/0x100
> > p2sb_bar+0xae/0x1d0
> > ...
> > i801_probe+0x423/0x740
> >
> > Signed-off-by: Eric Badger <[email protected]>
>
> It should have a fixes line, but I'm not sure what it is..

Starting from my known-working 5.15.125 base, I worked my way forward to
try to find the offenders. I think it's mainly these two:

dd8a25c557e1 ("iommu: Remove deferred attach check from __iommu_detach_device()")
586081d3f6b1 ("iommu/vt-d: Remove DEFER_DEVICE_DOMAIN_INFO")

Prior to the first one, we always end up doing a non-deferred attach on
the device, so it's not possible to have a not-yet-attached device at
release time.

The second one looks to be necessary to get the situation where we have
non-NULL info, but NULL info->domain.

Since there was a lot of refactoring done between 5.15.125 and 6.6.0, I
couldn't just cherry pick those two back to 5.15.125. But I was able to
cherry pick these two sequences with minimal manual massaging and that
reproed the crash in the crashkernel in intel_iommu_release_device() on
5.15.125:

a688b376fa5e iommu/vt-d: Remove DEFER_DEVICE_DOMAIN_INFO
c2ed7ea06b6b iommu/vt-d: Remove domain and devinfo mempool
d195dd05b950 iommu/vt-d: Remove iova_cache_get/put()
0822172fa474 iommu/vt-d: Remove finding domain in dmar_insert_one_dev_info()
443d995073fb iommu/vt-d: Remove intel_iommu::domains

f394cb40eaa7 iommu: Remove deferred attach check from __iommu_detach_device()
09d782454860 iommu: Remove unused argument in is_attach_deferred
35c672b9e7ed iommu: Add DMA ownership management interfaces
6b9638d70dff iommu: Use right way to retrieve iommu_ops

>
> > ---
> > drivers/iommu/intel/iommu.c | 10 ++++++----
> > 1 file changed, 6 insertions(+), 4 deletions(-)
>
> Unfortunately this issue is likely systemic in all the drivers.
>
> Something I've been thinking about for a while now is to have the
> option for the core code release to set the driver to a specific
> releasing domain (probably a blocking or identity global static)
>
> A lot of drivers are open coding this internally in their release
> functions, like Intel. But it really doesn't deserve to be special.
>
> Obviously if we structure things like that then this problem goes away
> too. It looks to me like domain_context_clear_one() is doing BLOCKED
> to me..
>
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index 6fb5f6fceea1..26e450259889 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -3750,7 +3750,6 @@ static void domain_context_clear(struct device_domain_info *info)
> > static void dmar_remove_one_dev_info(struct device *dev)
> > {
> > struct device_domain_info *info = dev_iommu_priv_get(dev);
> > - struct dmar_domain *domain = info->domain;
> > struct intel_iommu *iommu = info->iommu;
> > unsigned long flags;
> >
> > @@ -3763,11 +3762,14 @@ static void dmar_remove_one_dev_info(struct device *dev)
> > domain_context_clear(info);
> > }
> >
> > - spin_lock_irqsave(&domain->lock, flags);
> > + if (!info->domain)
> > + return;
> > +
> > + spin_lock_irqsave(&info->domain->lock, flags);
> > list_del(&info->link);
> > - spin_unlock_irqrestore(&domain->lock, flags);
> > + spin_unlock_irqrestore(&info->domain->lock, flags);
> >
> > - domain_detach_iommu(domain, iommu);
> > + domain_detach_iommu(info->domain, iommu);
> > info->domain = NULL;
> > }
>
> This function is almost a copy of device_block_translation(), with
> some bugs added :\
>
> Lu can they be made the same? It seems to me BLOCKED could always
> clear the domain context, even in scalable mode?
>
> Anyhow, my suggestion is more like:
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 15699fe5418e3d..6c683fd4bc05ec 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -3747,30 +3747,6 @@ static void domain_context_clear(struct device_domain_info *info)
> &domain_context_clear_one_cb, info);
> }
>
> -static void dmar_remove_one_dev_info(struct device *dev)
> -{
> - struct device_domain_info *info = dev_iommu_priv_get(dev);
> - struct dmar_domain *domain = info->domain;
> - struct intel_iommu *iommu = info->iommu;
> - unsigned long flags;
> -
> - if (!dev_is_real_dma_subdevice(info->dev)) {
> - if (dev_is_pci(info->dev) && sm_supported(iommu))
> - intel_pasid_tear_down_entry(iommu, info->dev,
> - IOMMU_NO_PASID, false);
> -
> - iommu_disable_pci_caps(info);
> - domain_context_clear(info);
> - }
> -
> - spin_lock_irqsave(&domain->lock, flags);
> - list_del(&info->link);
> - spin_unlock_irqrestore(&domain->lock, flags);
> -
> - domain_detach_iommu(domain, iommu);
> - info->domain = NULL;
> -}
> -
> /*
> * Clear the page table pointer in context or pasid table entries so that
> * all DMA requests without PASID from the device are blocked. If the page
> @@ -4284,7 +4260,6 @@ static void intel_iommu_release_device(struct device *dev)
> {
> struct device_domain_info *info = dev_iommu_priv_get(dev);
>
> - dmar_remove_one_dev_info(dev);
> intel_pasid_free_table(dev);
> intel_iommu_debugfs_remove_dev(info);
> kfree(info);
> @@ -4743,6 +4718,7 @@ static const struct iommu_dirty_ops intel_dirty_ops = {
>
> const struct iommu_ops intel_iommu_ops = {
> .blocked_domain = &blocking_domain,
> + .release_domain = &blocking_domain,
> .capable = intel_iommu_capable,
> .hw_info = intel_iommu_hw_info,
> .domain_alloc = intel_iommu_domain_alloc,
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 068f6508e57c9b..9fbbd83a82d4e9 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -442,6 +442,8 @@ static int iommu_init_device(struct device *dev, const struct iommu_ops *ops)
> err_unlink:
> iommu_device_unlink(iommu_dev, dev);
> err_release:
> + if (ops->release_domain)
> + ops->release_domain->ops->attach_dev(ops->release_domain, dev);
> if (ops->release_device)
> ops->release_device(dev);
> err_module_put:
> @@ -461,6 +463,15 @@ static void iommu_deinit_device(struct device *dev)
>
> iommu_device_unlink(dev->iommu->iommu_dev, dev);
>
> + /*
> + * If the iommu driver provides release_domain then the core code
> + * ensures that domain is attached prior to calling release_device.
> + * Drivers can use this to enforce a translation on the idle iommu.
> + * Usually the global static blocked_domain is a good choice.
> + */
> + if (ops->release_domain)
> + ops->release_domain->ops->attach_dev(ops->release_domain, dev);
> +
> /*
> * release_device() must stop using any attached domain on the device.
> * If there are still other devices in the group they are not effected
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index c34d0ef1703d68..1e8be165c878d8 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -511,6 +511,7 @@ struct iommu_ops {
> struct module *owner;
> struct iommu_domain *identity_domain;
> struct iommu_domain *blocked_domain;
> + struct iommu_domain *release_domain;
> struct iommu_domain *default_domain;
> };
>

FWIW, I took this patch for a spin, appears to work fine.

Thanks,
Eric

2024-01-31 12:09:23

by Baolu Lu

[permalink] [raw]
Subject: Re: [PATCH] iommu/vt-d: Check for non-NULL domain on device release

On 2024/1/16 23:22, Jason Gunthorpe wrote:
>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>> index 6fb5f6fceea1..26e450259889 100644
>> --- a/drivers/iommu/intel/iommu.c
>> +++ b/drivers/iommu/intel/iommu.c
>> @@ -3750,7 +3750,6 @@ static void domain_context_clear(struct device_domain_info *info)
>> static void dmar_remove_one_dev_info(struct device *dev)
>> {
>> struct device_domain_info *info = dev_iommu_priv_get(dev);
>> - struct dmar_domain *domain = info->domain;
>> struct intel_iommu *iommu = info->iommu;
>> unsigned long flags;
>>
>> @@ -3763,11 +3762,14 @@ static void dmar_remove_one_dev_info(struct device *dev)
>> domain_context_clear(info);
>> }
>>
>> - spin_lock_irqsave(&domain->lock, flags);
>> + if (!info->domain)
>> + return;
>> +
>> + spin_lock_irqsave(&info->domain->lock, flags);
>> list_del(&info->link);
>> - spin_unlock_irqrestore(&domain->lock, flags);
>> + spin_unlock_irqrestore(&info->domain->lock, flags);
>>
>> - domain_detach_iommu(domain, iommu);
>> + domain_detach_iommu(info->domain, iommu);
>> info->domain = NULL;
>> }
> This function is almost a copy of device_block_translation(), with
> some bugs added :\
>
> Lu can they be made the same? It seems to me BLOCKED could always
> clear the domain context, even in scalable mode?

Yes. A part of dmar_remove_one_dev_info() is duplicated with
device_block_translation().

>
> Anyhow, my suggestion is more like:

I like this suggestion.

Currently, the device_release callback for an iommu driver does the
following:

a) Silent IOMMU DMA translation. This is done by detaching the existing
domain from the IOMMU and bringing the IOMMU into a blocking state
(some drivers might be in identity state);
b) Releases the resources allocated in the probe callback and restores
the device to its previous state before the probe.

From my understanding of your suggestion, we should move a) out of the
release callback and make it a special domain, which could be a blocking
domain or identity domain, depending on the iommu hardware.

In the end, the release_device callback of an iommu driver should focus
on the opposite operation of device_probe. This makes the concept
clearer.

Best regards,
baolu

2024-02-21 15:46:12

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH] iommu/vt-d: Check for non-NULL domain on device release

On Wed, Jan 31, 2024 at 03:10:53PM +0800, Baolu Lu wrote:
> I like this suggestion.
>
> Currently, the device_release callback for an iommu driver does the
> following:
>
> a) Silent IOMMU DMA translation. This is done by detaching the existing
> domain from the IOMMU and bringing the IOMMU into a blocking state
> (some drivers might be in identity state);
> b) Releases the resources allocated in the probe callback and restores
> the device to its previous state before the probe.
>
> From my understanding of your suggestion, we should move a) out of the
> release callback and make it a special domain, which could be a blocking
> domain or identity domain, depending on the iommu hardware.
>
> In the end, the release_device callback of an iommu driver should focus
> on the opposite operation of device_probe. This makes the concept
> clearer.

Right

Can someone make some patches to fix Eric's bug? We don't really need
to do the release_domain stuff if the driver just self-attaches one of
its known static domain types (blocking/identity)

Jason

2024-02-22 15:00:24

by Baolu Lu

[permalink] [raw]
Subject: Re: [PATCH] iommu/vt-d: Check for non-NULL domain on device release

On 2024/2/21 23:40, Jason Gunthorpe wrote:
> On Wed, Jan 31, 2024 at 03:10:53PM +0800, Baolu Lu wrote:
>> I like this suggestion.
>>
>> Currently, the device_release callback for an iommu driver does the
>> following:
>>
>> a) Silent IOMMU DMA translation. This is done by detaching the existing
>> domain from the IOMMU and bringing the IOMMU into a blocking state
>> (some drivers might be in identity state);
>> b) Releases the resources allocated in the probe callback and restores
>> the device to its previous state before the probe.
>>
>> From my understanding of your suggestion, we should move a) out of the
>> release callback and make it a special domain, which could be a blocking
>> domain or identity domain, depending on the iommu hardware.
>>
>> In the end, the release_device callback of an iommu driver should focus
>> on the opposite operation of device_probe. This makes the concept
>> clearer.
> Right
>
> Can someone make some patches to fix Eric's bug? We don't really need
> to do the release_domain stuff if the driver just self-attaches one of
> its known static domain types (blocking/identity)

I will follow up with a formal patch series.

Best regards,
baolu