2024-02-23 05:19:11

by Baolu Lu

[permalink] [raw]
Subject: [PATCH 0/2] iommu: Fix domain check on device release

This is a follow-up to the discussion thread here:

https://lore.kernel.org/linux-iommu/[email protected]/

It fixes a NULL pointer reference in the Intel iommu driver and
strengthens the iommu core to possibly prevent similar failures in other
iommu drivers.

Best regards,
baolu

Lu Baolu (2):
iommu: Add static iommu_ops->release_domain
iommu/vt-d: Fix NULL domain on device release

include/linux/iommu.h | 1 +
drivers/iommu/intel/iommu.c | 26 +-------------------------
drivers/iommu/iommu.c | 12 ++++++++++++
3 files changed, 14 insertions(+), 25 deletions(-)

--
2.34.1



2024-02-23 05:19:21

by Baolu Lu

[permalink] [raw]
Subject: [PATCH 1/2] iommu: Add static iommu_ops->release_domain

The current device_release callback for individual iommu drivers does the
following:

1) Silent IOMMU DMA translation: It detaches any existing domain from the
device and puts it into a blocking state (some drivers might use the
identity state).
2) Resource release: It releases resources allocated during the
device_probe callback and restores the device to its pre-probe state.

Step 1 is challenging for individual iommu drivers because each must check
if a domain is already attached to the device. Additionally, if a deferred
attach never occurred, the device_release should avoid modifying hardware
configuration regardless of the reason for its call.

To simplify this process, introduce a static release_domain within the
iommu_ops structure. It can be either a blocking or identity domain
depending on the iommu hardware. The iommu core will decide whether to
attach this domain before the device_release callback, eliminating the
need for repetitive code in various drivers.

Consequently, the device_release callback can focus solely on the opposite
operations of device_probe, including releasing all resources allocated
during that callback.

Co-developed-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
---
include/linux/iommu.h | 1 +
drivers/iommu/iommu.c | 12 ++++++++++++
2 files changed, 13 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index de839fd01bb8..e3d9365b0fa9 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -585,6 +585,7 @@ struct iommu_ops {
struct module *owner;
struct iommu_domain *identity_domain;
struct iommu_domain *blocked_domain;
+ struct iommu_domain *release_domain;
struct iommu_domain *default_domain;
};

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 210dc7b4c8cf..fb06c3f47320 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -459,6 +459,18 @@ static void iommu_deinit_device(struct device *dev)

iommu_device_unlink(dev->iommu->iommu_dev, dev);

+ /*
+ * If the iommu driver provides release_domain then the core code
+ * ensures that domain is attached prior to calling release_device.
+ * Drivers can use this to enforce a translation on the idle iommu.
+ * Usually the global static blocked_domain is a good choice.
+ *
+ * Anyway, if a deferred attach never happened then the release
+ * should still avoid touching any hardware configuration either.
+ */
+ if (!dev->iommu->attach_deferred && ops->release_domain)
+ ops->release_domain->ops->attach_dev(ops->release_domain, dev);
+
/*
* release_device() must stop using any attached domain on the device.
* If there are still other devices in the group they are not effected
--
2.34.1


2024-02-23 09:43:48

by Baolu Lu

[permalink] [raw]
Subject: [PATCH 2/2] iommu/vt-d: Fix NULL domain on device release

In the kdump kernel, the IOMMU operates in deferred_attach mode. In this
mode, info->domain may not yet be assigned by the time the release_device
function is called. It leads to the following crash in the crash kernel:

BUG: kernel NULL pointer dereference, address: 000000000000003c
...
RIP: 0010:do_raw_spin_lock+0xa/0xa0
...
_raw_spin_lock_irqsave+0x1b/0x30
intel_iommu_release_device+0x96/0x170
iommu_deinit_device+0x39/0xf0
__iommu_group_remove_device+0xa0/0xd0
iommu_bus_notifier+0x55/0xb0
notifier_call_chain+0x5a/0xd0
blocking_notifier_call_chain+0x41/0x60
bus_notify+0x34/0x50
device_del+0x269/0x3d0
pci_remove_bus_device+0x77/0x100
p2sb_bar+0xae/0x1d0
...
i801_probe+0x423/0x740

Use the release_domain mechanism to fix it.

Fixes: 586081d3f6b1 ("iommu/vt-d: Remove DEFER_DEVICE_DOMAIN_INFO")
Reported-by: Eric Badger <[email protected]>
Closes: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel/iommu.c | 26 +-------------------------
1 file changed, 1 insertion(+), 25 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 61bb35046ea4..2c04ea90d22f 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -3798,30 +3798,6 @@ static void domain_context_clear(struct device_domain_info *info)
&domain_context_clear_one_cb, info);
}

-static void dmar_remove_one_dev_info(struct device *dev)
-{
- struct device_domain_info *info = dev_iommu_priv_get(dev);
- struct dmar_domain *domain = info->domain;
- struct intel_iommu *iommu = info->iommu;
- unsigned long flags;
-
- if (!dev_is_real_dma_subdevice(info->dev)) {
- if (dev_is_pci(info->dev) && sm_supported(iommu))
- intel_pasid_tear_down_entry(iommu, info->dev,
- IOMMU_NO_PASID, false);
-
- iommu_disable_pci_caps(info);
- domain_context_clear(info);
- }
-
- spin_lock_irqsave(&domain->lock, flags);
- list_del(&info->link);
- spin_unlock_irqrestore(&domain->lock, flags);
-
- domain_detach_iommu(domain, iommu);
- info->domain = NULL;
-}
-
/*
* Clear the page table pointer in context or pasid table entries so that
* all DMA requests without PASID from the device are blocked. If the page
@@ -4348,7 +4324,6 @@ static void intel_iommu_release_device(struct device *dev)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);

- dmar_remove_one_dev_info(dev);
intel_pasid_free_table(dev);
intel_iommu_debugfs_remove_dev(info);
kfree(info);
@@ -4839,6 +4814,7 @@ static const struct iommu_dirty_ops intel_dirty_ops = {

const struct iommu_ops intel_iommu_ops = {
.blocked_domain = &blocking_domain,
+ .release_domain = &blocking_domain,
.capable = intel_iommu_capable,
.hw_info = intel_iommu_hw_info,
.domain_alloc = intel_iommu_domain_alloc,
--
2.34.1


2024-02-27 07:33:09

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH 1/2] iommu: Add static iommu_ops->release_domain

> From: Lu Baolu <[email protected]>
> Sent: Friday, February 23, 2024 1:13 PM
>
>
> + /*
> + * If the iommu driver provides release_domain then the core code
> + * ensures that domain is attached prior to calling release_device.
> + * Drivers can use this to enforce a translation on the idle iommu.

'enforce a translation' is confusing in the context of blocking/identity
domain.

otherwise looks good to me:

Reviewed-by: Kevin Tian <[email protected]>

2024-02-27 07:45:40

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH 2/2] iommu/vt-d: Fix NULL domain on device release

> From: Lu Baolu <[email protected]>
> Sent: Friday, February 23, 2024 1:13 PM
>
> -static void dmar_remove_one_dev_info(struct device *dev)
> -{
> - struct device_domain_info *info = dev_iommu_priv_get(dev);
> - struct dmar_domain *domain = info->domain;
> - struct intel_iommu *iommu = info->iommu;
> - unsigned long flags;
> -
> - if (!dev_is_real_dma_subdevice(info->dev)) {
> - if (dev_is_pci(info->dev) && sm_supported(iommu))
> - intel_pasid_tear_down_entry(iommu, info->dev,
> - IOMMU_NO_PASID, false);
> -
> - iommu_disable_pci_caps(info);
> - domain_context_clear(info);
> - }
> -
> - spin_lock_irqsave(&domain->lock, flags);
> - list_del(&info->link);
> - spin_unlock_irqrestore(&domain->lock, flags);
> -
> - domain_detach_iommu(domain, iommu);
> - info->domain = NULL;
> -}
> -

what's required here is slightly different from device_block_translation()
which leaves context entry uncleared in scalable mode (implying the
pasid table must be valid). but in the release path the pasid table will
be freed right after then leading to a use-after-free case.

let's add an explicit domain_context_clear() in intel_iommu_release_device().

with that,

Reviewed-by: Kevin Tian <[email protected]>

2024-02-28 01:19:53

by Baolu Lu

[permalink] [raw]
Subject: Re: [PATCH 1/2] iommu: Add static iommu_ops->release_domain

On 2/27/24 3:32 PM, Tian, Kevin wrote:
>> From: Lu Baolu<[email protected]>
>> Sent: Friday, February 23, 2024 1:13 PM
>>
>>
>> + /*
>> + * If the iommu driver provides release_domain then the core code
>> + * ensures that domain is attached prior to calling release_device.
>> + * Drivers can use this to enforce a translation on the idle iommu.
> 'enforce a translation' is confusing in the context of blocking/identity
> domain.

Blocking or identity domain is also kind of a translation from the
core's perspective. The core does not care what type of translation the
release_domain is; it just enforces this type of translation before
device release if the driver has specified one.

Best regards,
baolu

2024-02-28 01:28:59

by Baolu Lu

[permalink] [raw]
Subject: Re: [PATCH 2/2] iommu/vt-d: Fix NULL domain on device release

On 2/27/24 3:40 PM, Tian, Kevin wrote:
>> From: Lu Baolu<[email protected]>
>> Sent: Friday, February 23, 2024 1:13 PM
>>
>> -static void dmar_remove_one_dev_info(struct device *dev)
>> -{
>> - struct device_domain_info *info = dev_iommu_priv_get(dev);
>> - struct dmar_domain *domain = info->domain;
>> - struct intel_iommu *iommu = info->iommu;
>> - unsigned long flags;
>> -
>> - if (!dev_is_real_dma_subdevice(info->dev)) {
>> - if (dev_is_pci(info->dev) && sm_supported(iommu))
>> - intel_pasid_tear_down_entry(iommu, info->dev,
>> - IOMMU_NO_PASID, false);
>> -
>> - iommu_disable_pci_caps(info);
>> - domain_context_clear(info);
>> - }
>> -
>> - spin_lock_irqsave(&domain->lock, flags);
>> - list_del(&info->link);
>> - spin_unlock_irqrestore(&domain->lock, flags);
>> -
>> - domain_detach_iommu(domain, iommu);
>> - info->domain = NULL;
>> -}
>> -
> what's required here is slightly different from device_block_translation()
> which leaves context entry uncleared in scalable mode (implying the
> pasid table must be valid). but in the release path the pasid table will
> be freed right after then leading to a use-after-free case.
>
> let's add an explicit domain_context_clear() in intel_iommu_release_device().

Nice catch!

How about moving the scalable mode context entry management to probe and
release path? Currently, it's part of domain switch, that's really
irrelevant.

Best regards,
baolu

2024-02-28 03:08:38

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH 2/2] iommu/vt-d: Fix NULL domain on device release

> From: Baolu Lu <[email protected]>
> Sent: Wednesday, February 28, 2024 9:23 AM
>
> On 2/27/24 3:40 PM, Tian, Kevin wrote:
> >> From: Lu Baolu<[email protected]>
> >> Sent: Friday, February 23, 2024 1:13 PM
> >>
> >> -static void dmar_remove_one_dev_info(struct device *dev)
> >> -{
> >> - struct device_domain_info *info = dev_iommu_priv_get(dev);
> >> - struct dmar_domain *domain = info->domain;
> >> - struct intel_iommu *iommu = info->iommu;
> >> - unsigned long flags;
> >> -
> >> - if (!dev_is_real_dma_subdevice(info->dev)) {
> >> - if (dev_is_pci(info->dev) && sm_supported(iommu))
> >> - intel_pasid_tear_down_entry(iommu, info->dev,
> >> - IOMMU_NO_PASID, false);
> >> -
> >> - iommu_disable_pci_caps(info);
> >> - domain_context_clear(info);
> >> - }
> >> -
> >> - spin_lock_irqsave(&domain->lock, flags);
> >> - list_del(&info->link);
> >> - spin_unlock_irqrestore(&domain->lock, flags);
> >> -
> >> - domain_detach_iommu(domain, iommu);
> >> - info->domain = NULL;
> >> -}
> >> -
> > what's required here is slightly different from device_block_translation()
> > which leaves context entry uncleared in scalable mode (implying the
> > pasid table must be valid). but in the release path the pasid table will
> > be freed right after then leading to a use-after-free case.
> >
> > let's add an explicit domain_context_clear() in
> intel_iommu_release_device().
>
> Nice catch!
>
> How about moving the scalable mode context entry management to probe
> and
> release path? Currently, it's part of domain switch, that's really
> irrelevant.
>

sounds good.

2024-02-28 03:36:07

by Baolu Lu

[permalink] [raw]
Subject: Re: [PATCH 1/2] iommu: Add static iommu_ops->release_domain

On 2/28/24 11:08 AM, Tian, Kevin wrote:
>> From: Baolu Lu <[email protected]>
>> Sent: Wednesday, February 28, 2024 9:14 AM
>>
>> On 2/27/24 3:32 PM, Tian, Kevin wrote:
>>>> From: Lu Baolu<[email protected]>
>>>> Sent: Friday, February 23, 2024 1:13 PM
>>>>
>>>>
>>>> + /*
>>>> + * If the iommu driver provides release_domain then the core code
>>>> + * ensures that domain is attached prior to calling release_device.
>>>> + * Drivers can use this to enforce a translation on the idle iommu.
>>> 'enforce a translation' is confusing in the context of blocking/identity
>>> domain.
>>
>> Blocking or identity domain is also kind of a translation from the
>> core's perspective. The core does not care what type of translation the
>> release_domain is; it just enforces this type of translation before
>> device release if the driver has specified one.
>>
>
> OK.
>
> btw you may also want to update the following comment together:
>
> /*
> * release_device() must stop using any attached domain on the device.
> * If there are still other devices in the group they are not effected
> * by this callback.
> *
> * The IOMMU driver must set the device to either an identity or
> * blocking translation and stop using any domain pointer, as it is
> * going to be freed.
> */
>
> all the purpose is to set the device to identity or blocking, either via attaching
> to an explicit release_domain or implicitly by release_device().

Done.

Best regards,
baolu

2024-02-28 03:45:36

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH 1/2] iommu: Add static iommu_ops->release_domain

> From: Baolu Lu <[email protected]>
> Sent: Wednesday, February 28, 2024 9:14 AM
>
> On 2/27/24 3:32 PM, Tian, Kevin wrote:
> >> From: Lu Baolu<[email protected]>
> >> Sent: Friday, February 23, 2024 1:13 PM
> >>
> >>
> >> + /*
> >> + * If the iommu driver provides release_domain then the core code
> >> + * ensures that domain is attached prior to calling release_device.
> >> + * Drivers can use this to enforce a translation on the idle iommu.
> > 'enforce a translation' is confusing in the context of blocking/identity
> > domain.
>
> Blocking or identity domain is also kind of a translation from the
> core's perspective. The core does not care what type of translation the
> release_domain is; it just enforces this type of translation before
> device release if the driver has specified one.
>

OK.

btw you may also want to update the following comment together:

/*
* release_device() must stop using any attached domain on the device.
* If there are still other devices in the group they are not effected
* by this callback.
*
* The IOMMU driver must set the device to either an identity or
* blocking translation and stop using any domain pointer, as it is
* going to be freed.
*/

all the purpose is to set the device to identity or blocking, either via attaching
to an explicit release_domain or implicitly by release_device().