2024-02-29 03:32:02

by Ethan Zhao

[permalink] [raw]
Subject: [PATCH] iommu/vt-d: avoid sending explicit ATS invalidation request to released device

The introduction of per iommu device rbtree also defines the lifetime of
interoperation between iommu and devices, if the device has been released
from device rbtree, no need to send ATS invalidation request to it anymore,
thus avoid the possibility of later ITE fault to be triggered.

This is part of the followup of prior proposed patchset

https://do-db2.lkml.org/lkml/2024/2/22/350

To make sure all the devTLB entries to be invalidated in the device release
path, do implict invalidation by fapping the E bit of ATS control register.
see PCIe spec v6.2, sec 10.3.7 implicit invalidation events.

Fixes: 6f7db75e1c46 ("iommu/vt-d: Add second level page table interface")
Signed-off-by: Ethan Zhao <[email protected]>
---
drivers/iommu/intel/iommu.c | 6 ++++++
drivers/iommu/intel/pasid.c | 7 +++++++
2 files changed, 13 insertions(+)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 6743fe6c7a36..b85d88ccb4b0 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1368,6 +1368,12 @@ static void iommu_disable_pci_caps(struct device_domain_info *info)
pdev = to_pci_dev(info->dev);

if (info->ats_enabled) {
+ pci_disable_ats(pdev);
+ /*
+ * flap the E bit of ATS control register to do implicit
+ * ATS invlidation, see PCIe spec v6.2, sec 10.3.7
+ */
+ pci_enable_ats(pdev, VTD_PAGE_SHIFT);
pci_disable_ats(pdev);
info->ats_enabled = 0;
domain_update_iotlb(info->domain);
diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index 108158e2b907..5f13e017a12c 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -215,6 +215,13 @@ devtlb_invalidation_with_pasid(struct intel_iommu *iommu,
return;

sid = info->bus << 8 | info->devfn;
+ /*
+ * If device has been released from rbtree, no need to send ATS
+ * Invalidation request anymore, that could cause ITE fault.
+ */
+ if (!device_rbtree_find(iommu, sid))
+ return;
+
qdep = info->ats_qdep;
pfsid = info->pfsid;

--
2.31.1



2024-02-29 03:34:35

by Ethan Zhao

[permalink] [raw]
Subject: Re: [PATCH] iommu/vt-d: avoid sending explicit ATS invalidation request to released device

On 2/29/2024 11:31 AM, Ethan Zhao wrote:
> The introduction of per iommu device rbtree also defines the lifetime of
> interoperation between iommu and devices, if the device has been released
> from device rbtree, no need to send ATS invalidation request to it anymore,
> thus avoid the possibility of later ITE fault to be triggered.
>
> This is part of the followup of prior proposed patchset
>
> https://do-db2.lkml.org/lkml/2024/2/22/350
>
> To make sure all the devTLB entries to be invalidated in the device release
> path, do implict invalidation by fapping the E bit of ATS control register.
> see PCIe spec v6.2, sec 10.3.7 implicit invalidation events.
>
> Fixes: 6f7db75e1c46 ("iommu/vt-d: Add second level page table interface")
> Signed-off-by: Ethan Zhao <[email protected]>
> ---
> drivers/iommu/intel/iommu.c | 6 ++++++
> drivers/iommu/intel/pasid.c | 7 +++++++
> 2 files changed, 13 insertions(+)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 6743fe6c7a36..b85d88ccb4b0 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -1368,6 +1368,12 @@ static void iommu_disable_pci_caps(struct device_domain_info *info)
> pdev = to_pci_dev(info->dev);
>
> if (info->ats_enabled) {
> + pci_disable_ats(pdev);
> + /*
> + * flap the E bit of ATS control register to do implicit
> + * ATS invlidation, see PCIe spec v6.2, sec 10.3.7
> + */
> + pci_enable_ats(pdev, VTD_PAGE_SHIFT);
> pci_disable_ats(pdev);
> info->ats_enabled = 0;
> domain_update_iotlb(info->domain);
> diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
> index 108158e2b907..5f13e017a12c 100644
> --- a/drivers/iommu/intel/pasid.c
> +++ b/drivers/iommu/intel/pasid.c
> @@ -215,6 +215,13 @@ devtlb_invalidation_with_pasid(struct intel_iommu *iommu,
> return;
>
> sid = info->bus << 8 | info->devfn;
> + /*
> + * If device has been released from rbtree, no need to send ATS
> + * Invalidation request anymore, that could cause ITE fault.
> + */
> + if (!device_rbtree_find(iommu, sid))
> + return;
> +
> qdep = info->ats_qdep;
> pfsid = info->pfsid;
>

This patch based on Baolu's per iommu device rbtree patchset
https://github.com/LuBaolu/intel-iommu/commits/rbtree-for-device-info-v2


Thanks,
Ethan


2024-02-29 21:16:04

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH] iommu/vt-d: avoid sending explicit ATS invalidation request to released device

On Wed, Feb 28, 2024 at 10:31:38PM -0500, Ethan Zhao wrote:
> The introduction of per iommu device rbtree also defines the lifetime of
> interoperation between iommu and devices, if the device has been released
> from device rbtree, no need to send ATS invalidation request to it anymore,
> thus avoid the possibility of later ITE fault to be triggered.
>
> This is part of the followup of prior proposed patchset
>
> https://do-db2.lkml.org/lkml/2024/2/22/350

Please use https://lore.kernel.org/ URLs instead. This one looks like
https://lore.kernel.org/r/[email protected]/

> To make sure all the devTLB entries to be invalidated in the device release
> path, do implict invalidation by fapping the E bit of ATS control register.
> see PCIe spec v6.2, sec 10.3.7 implicit invalidation events.

s/implict/implicit/

s/fapping/?/ (no idea :) "flipping"? Oh, probably "flapping" per the
comment below. But I think "flapping" is ambiguous; "setting" would be
better)

s/v6.2/r6.2/ (also below in comment)

> Fixes: 6f7db75e1c46 ("iommu/vt-d: Add second level page table interface")
> Signed-off-by: Ethan Zhao <[email protected]>
> ---
> drivers/iommu/intel/iommu.c | 6 ++++++
> drivers/iommu/intel/pasid.c | 7 +++++++
> 2 files changed, 13 insertions(+)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 6743fe6c7a36..b85d88ccb4b0 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -1368,6 +1368,12 @@ static void iommu_disable_pci_caps(struct device_domain_info *info)
> pdev = to_pci_dev(info->dev);
>
> if (info->ats_enabled) {
> + pci_disable_ats(pdev);
> + /*
> + * flap the E bit of ATS control register to do implicit
> + * ATS invlidation, see PCIe spec v6.2, sec 10.3.7

s/invlidation/invalidation/

I would put the comment above the pci_disable_ats(), so it looks like
this:

/* comment ... */
pci_disable_ats(pdev);
pci_enable_ats(pdev, VTD_PAGE_SHIFT);

pci_disable_ats(pdev);

because the spec says the E field must change from Clear to Set to
cause invalidation of all ATC entries, so it's important to see that
we clear E first, then set it.

> + */
> + pci_enable_ats(pdev, VTD_PAGE_SHIFT);
> pci_disable_ats(pdev);
> info->ats_enabled = 0;
> domain_update_iotlb(info->domain);
> diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
> index 108158e2b907..5f13e017a12c 100644
> --- a/drivers/iommu/intel/pasid.c
> +++ b/drivers/iommu/intel/pasid.c
> @@ -215,6 +215,13 @@ devtlb_invalidation_with_pasid(struct intel_iommu *iommu,
> return;
>
> sid = info->bus << 8 | info->devfn;
> + /*
> + * If device has been released from rbtree, no need to send ATS
> + * Invalidation request anymore, that could cause ITE fault.
> + */
> + if (!device_rbtree_find(iommu, sid))
> + return;
> +
> qdep = info->ats_qdep;
> pfsid = info->pfsid;
>
> --
> 2.31.1
>

2024-03-01 01:50:51

by Ethan Zhao

[permalink] [raw]
Subject: Re: [PATCH] iommu/vt-d: avoid sending explicit ATS invalidation request to released device


On 3/1/2024 5:06 AM, Bjorn Helgaas wrote:
> On Wed, Feb 28, 2024 at 10:31:38PM -0500, Ethan Zhao wrote:
>> The introduction of per iommu device rbtree also defines the lifetime of
>> interoperation between iommu and devices, if the device has been released
>> from device rbtree, no need to send ATS invalidation request to it anymore,
>> thus avoid the possibility of later ITE fault to be triggered.
>>
>> This is part of the followup of prior proposed patchset
>>
>> https://do-db2.lkml.org/lkml/2024/2/22/350
> Please use https://lore.kernel.org/ URLs instead. This one looks like
> https://lore.kernel.org/r/[email protected]/
>
>> To make sure all the devTLB entries to be invalidated in the device release
>> path, do implict invalidation by fapping the E bit of ATS control register.
>> see PCIe spec v6.2, sec 10.3.7 implicit invalidation events.
> s/implict/implicit/
>
> s/fapping/?/ (no idea :) "flipping"? Oh, probably "flapping" per the
> comment below. But I think "flapping" is ambiguous; "setting" would be
> better)

Yup, like the memory bit flipping, no idea what is the right word,
setting one bit to 0, then 1, then back to 0. perhaps details the
setting action 0-->1-->0 ?

> s/v6.2/r6.2/ (also below in comment)

got it.

>
>> Fixes: 6f7db75e1c46 ("iommu/vt-d: Add second level page table interface")
>> Signed-off-by: Ethan Zhao <[email protected]>
>> ---
>> drivers/iommu/intel/iommu.c | 6 ++++++
>> drivers/iommu/intel/pasid.c | 7 +++++++
>> 2 files changed, 13 insertions(+)
>>
>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>> index 6743fe6c7a36..b85d88ccb4b0 100644
>> --- a/drivers/iommu/intel/iommu.c
>> +++ b/drivers/iommu/intel/iommu.c
>> @@ -1368,6 +1368,12 @@ static void iommu_disable_pci_caps(struct device_domain_info *info)
>> pdev = to_pci_dev(info->dev);
>>
>> if (info->ats_enabled) {
>> + pci_disable_ats(pdev);
>> + /*
>> + * flap the E bit of ATS control register to do implicit
>> + * ATS invlidation, see PCIe spec v6.2, sec 10.3.7
> s/invlidation/invalidation/
>
> I would put the comment above the pci_disable_ats(), so it looks like
> this:
>
> /* comment ... */
> pci_disable_ats(pdev);
> pci_enable_ats(pdev, VTD_PAGE_SHIFT);
>
> pci_disable_ats(pdev);
>
> because the spec says the E field must change from Clear to Set to
> cause invalidation of all ATC entries, so it's important to see that
> we clear E first, then set it.

Great, will correct.

Thanks,
Ethan

>
>> + */
>> + pci_enable_ats(pdev, VTD_PAGE_SHIFT);
>> pci_disable_ats(pdev);
>> info->ats_enabled = 0;
>> domain_update_iotlb(info->domain);
>> diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
>> index 108158e2b907..5f13e017a12c 100644
>> --- a/drivers/iommu/intel/pasid.c
>> +++ b/drivers/iommu/intel/pasid.c
>> @@ -215,6 +215,13 @@ devtlb_invalidation_with_pasid(struct intel_iommu *iommu,
>> return;
>>
>> sid = info->bus << 8 | info->devfn;
>> + /*
>> + * If device has been released from rbtree, no need to send ATS
>> + * Invalidation request anymore, that could cause ITE fault.
>> + */
>> + if (!device_rbtree_find(iommu, sid))
>> + return;
>> +
>> qdep = info->ats_qdep;
>> pfsid = info->pfsid;
>>
>> --
>> 2.31.1
>>

2024-03-01 07:04:51

by Ethan Zhao

[permalink] [raw]
Subject: Re: [PATCH] iommu/vt-d: avoid sending explicit ATS invalidation request to released device


On 2/29/2024 11:31 AM, Ethan Zhao wrote:
> The introduction of per iommu device rbtree also defines the lifetime of
> interoperation between iommu and devices, if the device has been released
> from device rbtree, no need to send ATS invalidation request to it anymore,
> thus avoid the possibility of later ITE fault to be triggered.
>
> This is part of the followup of prior proposed patchset
>
> https://do-db2.lkml.org/lkml/2024/2/22/350
>
> To make sure all the devTLB entries to be invalidated in the device release
> path, do implict invalidation by fapping the E bit of ATS control register.
> see PCIe spec v6.2, sec 10.3.7 implicit invalidation events.
>
> Fixes: 6f7db75e1c46 ("iommu/vt-d: Add second level page table interface")
> Signed-off-by: Ethan Zhao <[email protected]>
> ---
> drivers/iommu/intel/iommu.c | 6 ++++++
> drivers/iommu/intel/pasid.c | 7 +++++++
> 2 files changed, 13 insertions(+)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 6743fe6c7a36..b85d88ccb4b0 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -1368,6 +1368,12 @@ static void iommu_disable_pci_caps(struct device_domain_info *info)
> pdev = to_pci_dev(info->dev);
>
> if (info->ats_enabled) {
> + pci_disable_ats(pdev);
> + /*
> + * flap the E bit of ATS control register to do implicit
> + * ATS invlidation, see PCIe spec v6.2, sec 10.3.7
> + */
> + pci_enable_ats(pdev, VTD_PAGE_SHIFT);
> pci_disable_ats(pdev);
> info->ats_enabled = 0;
> domain_update_iotlb(info->domain);
> diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
> index 108158e2b907..5f13e017a12c 100644
> --- a/drivers/iommu/intel/pasid.c
> +++ b/drivers/iommu/intel/pasid.c
> @@ -215,6 +215,13 @@ devtlb_invalidation_with_pasid(struct intel_iommu *iommu,
> return;
>
> sid = info->bus << 8 | info->devfn;
> + /*
> + * If device has been released from rbtree, no need to send ATS
> + * Invalidation request anymore, that could cause ITE fault.
> + */
> + if (!device_rbtree_find(iommu, sid))
> + return;
> +
> qdep = info->ats_qdep;
> pfsid = info->pfsid;
>

Given maintainer is going to pick up patchset
https://lore.kernel.org/lkml/[email protected]/T/
and this one is mutually exclusive with it, suspend.


Thanks,
Ethan


2024-03-01 21:56:38

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH] iommu/vt-d: avoid sending explicit ATS invalidation request to released device

On Fri, Mar 01, 2024 at 09:50:36AM +0800, Ethan Zhao wrote:
> On 3/1/2024 5:06 AM, Bjorn Helgaas wrote:
> > On Wed, Feb 28, 2024 at 10:31:38PM -0500, Ethan Zhao wrote:
> > > The introduction of per iommu device rbtree also defines the lifetime of
> > > interoperation between iommu and devices, if the device has been released
> > > from device rbtree, no need to send ATS invalidation request to it anymore,
> > > thus avoid the possibility of later ITE fault to be triggered.
> > >
> > > This is part of the followup of prior proposed patchset
> > >
> > > https://do-db2.lkml.org/lkml/2024/2/22/350
> > Please use https://lore.kernel.org/ URLs instead. This one looks like
> > https://lore.kernel.org/r/[email protected]/
> >
> > > To make sure all the devTLB entries to be invalidated in the device release
> > > path, do implict invalidation by fapping the E bit of ATS control register.
> > > see PCIe spec v6.2, sec 10.3.7 implicit invalidation events.
> > s/implict/implicit/
> >
> > s/fapping/?/ (no idea :) "flipping"? Oh, probably "flapping" per the
> > comment below. But I think "flapping" is ambiguous; "setting" would be
> > better)
>
> Yup, like the memory bit flipping, no idea what is the right word,
> setting one bit to 0, then 1, then back to 0. perhaps details the
> setting action 0-->1-->0 ?

In PCIe spec-speak, "Set" means "assign 1 to this", and "Clear" means
"assign 0 to this".

Maybe you could copy the spec language like this:

Invalidate all ATC entries by changing the E field in the ATS
Capability from Clear to Set, which causes an implicit invalidation
event.

Bjorn

2024-03-04 03:10:21

by Ethan Zhao

[permalink] [raw]
Subject: Re: [PATCH] iommu/vt-d: avoid sending explicit ATS invalidation request to released device

On 3/2/2024 5:56 AM, Bjorn Helgaas wrote:
> On Fri, Mar 01, 2024 at 09:50:36AM +0800, Ethan Zhao wrote:
>> On 3/1/2024 5:06 AM, Bjorn Helgaas wrote:
>>> On Wed, Feb 28, 2024 at 10:31:38PM -0500, Ethan Zhao wrote:
>>>> The introduction of per iommu device rbtree also defines the lifetime of
>>>> interoperation between iommu and devices, if the device has been released
>>>> from device rbtree, no need to send ATS invalidation request to it anymore,
>>>> thus avoid the possibility of later ITE fault to be triggered.
>>>>
>>>> This is part of the followup of prior proposed patchset
>>>>
>>>> https://do-db2.lkml.org/lkml/2024/2/22/350
>>> Please use https://lore.kernel.org/ URLs instead. This one looks like
>>> https://lore.kernel.org/r/[email protected]/
>>>
>>>> To make sure all the devTLB entries to be invalidated in the device release
>>>> path, do implict invalidation by fapping the E bit of ATS control register.
>>>> see PCIe spec v6.2, sec 10.3.7 implicit invalidation events.
>>> s/implict/implicit/
>>>
>>> s/fapping/?/ (no idea :) "flipping"? Oh, probably "flapping" per the
>>> comment below. But I think "flapping" is ambiguous; "setting" would be
>>> better)
>> Yup, like the memory bit flipping, no idea what is the right word,
>> setting one bit to 0, then 1, then back to 0. perhaps details the
>> setting action 0-->1-->0 ?
> In PCIe spec-speak, "Set" means "assign 1 to this", and "Clear" means
> "assign 0 to this".
>
> Maybe you could copy the spec language like this:
>
> Invalidate all ATC entries by changing the E field in the ATS
> Capability from Clear to Set, which causes an implicit invalidation
> event.

Fair enough.

Thanks,
Ethan

>
> Bjorn
>