2024-04-07 14:43:51

by Baolu Lu

[permalink] [raw]
Subject: [PATCH 1/2] iommu/vt-d: Avoid unnecessary device TLB flush in map path

iommu_flush_iotlb_psi() is called in map and unmap paths. The caching
mode check before device TLB invalidation will cause device TLB
invalidation always issued if IOMMU is not running in the caching mode.
This is inefficient and causes performance overhead.

Make device TLB invalidation behavior consistent between batched mode
unmapping and strict mode unmapping. Device TLB invalidation should only
be requested in the unmap path if the IOMMU is not in caching mode.

Fixes: bf92df30df90 ("intel-iommu: Only avoid flushing device IOTLB for domain ID 0 in caching mode")
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel/iommu.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 50eb9aed47cc..493b6a600394 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1501,11 +1501,7 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
else
__iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);

- /*
- * In caching mode, changes of pages from non-present to present require
- * flush. However, device IOTLB doesn't need to be flushed in this case.
- */
- if (!cap_caching_mode(iommu->cap) || !map)
+ if (!cap_caching_mode(iommu->cap) && !map)
iommu_flush_dev_iotlb(domain, addr, mask);
}

--
2.34.1



2024-04-07 14:43:58

by Baolu Lu

[permalink] [raw]
Subject: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush

The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
implementation caches not-present or erroneous translation-structure
entries except the first-stage translation. The caching mode is
unrelated to the device TLB , therefore there is no need to check
it before a device TLB invalidation operation.

Before the scalable mode is introduced, caching mode is treated as
an indication that the driver is running in a VM guest. This is just
a software contract as shadow page table is the only way to implement
a virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
the scalable mode is introduced, this doesn't stand for anymore, as
caching mode is not relevant for the first-stage translation. A virtual
IOMMU implementation is free to support first-stage translation only
with caching mode cleared.

Remove the caching mode check before device TLB invalidation to ensure
compatibility with the scalable mode use cases.

Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by default")
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel/iommu.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 493b6a600394..681789b1258d 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1501,7 +1501,7 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
else
__iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);

- if (!cap_caching_mode(iommu->cap) && !map)
+ if (!map)
iommu_flush_dev_iotlb(domain, addr, mask);
}

@@ -1575,8 +1575,7 @@ static void intel_flush_iotlb_all(struct iommu_domain *domain)
iommu->flush.flush_iotlb(iommu, did, 0, 0,
DMA_TLB_DSI_FLUSH);

- if (!cap_caching_mode(iommu->cap))
- iommu_flush_dev_iotlb(dmar_domain, 0, MAX_AGAW_PFN_WIDTH);
+ iommu_flush_dev_iotlb(dmar_domain, 0, MAX_AGAW_PFN_WIDTH);
}

if (dmar_domain->nested_parent)
--
2.34.1


2024-04-08 07:22:07

by Ethan Zhao

[permalink] [raw]
Subject: Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush

On 4/7/2024 10:42 PM, Lu Baolu wrote:
> The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
> implementation caches not-present or erroneous translation-structure
> entries except the first-stage translation. The caching mode is
> unrelated to the device TLB , therefore there is no need to check
> it before a device TLB invalidation operation.
>
> Before the scalable mode is introduced, caching mode is treated as
> an indication that the driver is running in a VM guest. This is just
> a software contract as shadow page table is the only way to implement
> a virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
> the scalable mode is introduced, this doesn't stand for anymore, as
> caching mode is not relevant for the first-stage translation. A virtual
> IOMMU implementation is free to support first-stage translation only
> with caching mode cleared.
>
> Remove the caching mode check before device TLB invalidation to ensure
> compatibility with the scalable mode use cases.
>
> Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by default")
> Signed-off-by: Lu Baolu <[email protected]>
> ---
> drivers/iommu/intel/iommu.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 493b6a600394..681789b1258d 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -1501,7 +1501,7 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
> else
> __iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
>
> - if (!cap_caching_mode(iommu->cap) && !map)
> + if (!map)

My understanding, we don't need patch[1/2] at all, and customer is just asking
about the CM & tlb flushing, it is great to have this commit [2/2].


Thanks,
Ethan

> iommu_flush_dev_iotlb(domain, addr, mask);
> }
>
> @@ -1575,8 +1575,7 @@ static void intel_flush_iotlb_all(struct iommu_domain *domain)
> iommu->flush.flush_iotlb(iommu, did, 0, 0,
> DMA_TLB_DSI_FLUSH);
>
> - if (!cap_caching_mode(iommu->cap))
> - iommu_flush_dev_iotlb(dmar_domain, 0, MAX_AGAW_PFN_WIDTH);
> + iommu_flush_dev_iotlb(dmar_domain, 0, MAX_AGAW_PFN_WIDTH);
> }
>
> if (dmar_domain->nested_parent)

2024-04-08 07:24:11

by Baolu Lu

[permalink] [raw]
Subject: Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush

On 2024/4/8 15:21, Ethan Zhao wrote:
> On 4/7/2024 10:42 PM, Lu Baolu wrote:
>> The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
>> implementation caches not-present or erroneous translation-structure
>> entries except the first-stage translation. The caching mode is
>> unrelated to the device TLB , therefore there is no need to check
>> it before a device TLB invalidation operation.
>>
>> Before the scalable mode is introduced, caching mode is treated as
>> an indication that the driver is running in a VM guest. This is just
>> a software contract as shadow page table is the only way to implement
>> a virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
>> the scalable mode is introduced, this doesn't stand for anymore, as
>> caching mode is not relevant for the first-stage translation. A virtual
>> IOMMU implementation is free to support first-stage translation only
>> with caching mode cleared.
>>
>> Remove the caching mode check before device TLB invalidation to ensure
>> compatibility with the scalable mode use cases.
>>
>> Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by
>> default")
>> Signed-off-by: Lu Baolu <[email protected]>
>> ---
>>   drivers/iommu/intel/iommu.c | 5 ++---
>>   1 file changed, 2 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>> index 493b6a600394..681789b1258d 100644
>> --- a/drivers/iommu/intel/iommu.c
>> +++ b/drivers/iommu/intel/iommu.c
>> @@ -1501,7 +1501,7 @@ static void iommu_flush_iotlb_psi(struct
>> intel_iommu *iommu,
>>       else
>>           __iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
>> -    if (!cap_caching_mode(iommu->cap) && !map)
>> +    if (!map)
>
> My understanding, we don't need patch[1/2] at all, and customer is just
> asking
> about the CM & tlb flushing, it is great to have this commit [2/2].

Actually they fix different problems.

Best regards,
baolu

2024-04-08 07:47:00

by Ethan Zhao

[permalink] [raw]
Subject: Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush

On 4/8/2024 3:23 PM, Baolu Lu wrote:
> On 2024/4/8 15:21, Ethan Zhao wrote:
>> On 4/7/2024 10:42 PM, Lu Baolu wrote:
>>> The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
>>> implementation caches not-present or erroneous translation-structure
>>> entries except the first-stage translation. The caching mode is
>>> unrelated to the device TLB , therefore there is no need to check
>>> it before a device TLB invalidation operation.
>>>
>>> Before the scalable mode is introduced, caching mode is treated as
>>> an indication that the driver is running in a VM guest. This is just
>>> a software contract as shadow page table is the only way to implement
>>> a virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
>>> the scalable mode is introduced, this doesn't stand for anymore, as
>>> caching mode is not relevant for the first-stage translation. A virtual
>>> IOMMU implementation is free to support first-stage translation only
>>> with caching mode cleared.
>>>
>>> Remove the caching mode check before device TLB invalidation to ensure
>>> compatibility with the scalable mode use cases.
>>>
>>> Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode
>>> by default")
>>> Signed-off-by: Lu Baolu <[email protected]>
>>> ---
>>>   drivers/iommu/intel/iommu.c | 5 ++---
>>>   1 file changed, 2 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>>> index 493b6a600394..681789b1258d 100644
>>> --- a/drivers/iommu/intel/iommu.c
>>> +++ b/drivers/iommu/intel/iommu.c
>>> @@ -1501,7 +1501,7 @@ static void iommu_flush_iotlb_psi(struct
>>> intel_iommu *iommu,
>>>       else
>>>           __iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
>>> -    if (!cap_caching_mode(iommu->cap) && !map)
>>> +    if (!map)
>>
>> My understanding, we don't need patch[1/2] at all, and customer is
>> just asking
>> about the CM & tlb flushing, it is great to have this commit [2/2].
>
> Actually they fix different problems.

Yup, progress view.

Thanks,
Ethan

>
> Best regards,
> baolu

2024-04-08 20:59:13

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush

Hi Lu,

On Sun, 7 Apr 2024 22:42:32 +0800, Lu Baolu <[email protected]>
wrote:

> The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
> implementation caches not-present or erroneous translation-structure
> entries except the first-stage translation. The caching mode is
> unrelated to the device TLB , therefore there is no need to check
> it before a device TLB invalidation operation.
>
> Before the scalable mode is introduced, caching mode is treated as
> an indication that the driver is running in a VM guest. This is just
> a software contract as shadow page table is the only way to implement
> a virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
> the scalable mode is introduced, this doesn't stand for anymore, as
> caching mode is not relevant for the first-stage translation. A virtual
> IOMMU implementation is free to support first-stage translation only
> with caching mode cleared.
>
> Remove the caching mode check before device TLB invalidation to ensure
> compatibility with the scalable mode use cases.
>
I agree with the changes below, what about this CM check:

/* Notification for newly created mappings */
static void __mapping_notify_one(struct intel_iommu *iommu, struct dmar_domain *domain,
unsigned long pfn, unsigned int pages)
{
/*
* It's a non-present to present mapping. Only flush if caching mode
* and second level.
*/
if (cap_caching_mode(iommu->cap) && !domain->use_first_level)
iommu_flush_iotlb_psi(iommu, domain, pfn, pages, 0, 1);

We are still tying devTLB flush to CM=1, no?

If we are running in the guest with second level page table (shadowed), can
we decide if devTLB flush is needed based on ATS enable just as the rest of
the cases?


> Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by
> default") Signed-off-by: Lu Baolu <[email protected]>
> ---
> drivers/iommu/intel/iommu.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 493b6a600394..681789b1258d 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -1501,7 +1501,7 @@ static void iommu_flush_iotlb_psi(struct
> intel_iommu *iommu, else
> __iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
>
> - if (!cap_caching_mode(iommu->cap) && !map)
> + if (!map)
> iommu_flush_dev_iotlb(domain, addr, mask);
> }
>
> @@ -1575,8 +1575,7 @@ static void intel_flush_iotlb_all(struct
> iommu_domain *domain) iommu->flush.flush_iotlb(iommu, did, 0, 0,
> DMA_TLB_DSI_FLUSH);
>
> - if (!cap_caching_mode(iommu->cap))
> - iommu_flush_dev_iotlb(dmar_domain, 0,
> MAX_AGAW_PFN_WIDTH);
> + iommu_flush_dev_iotlb(dmar_domain, 0,
> MAX_AGAW_PFN_WIDTH); }
>
> if (dmar_domain->nested_parent)


Thanks,

Jacob

2024-04-09 03:13:46

by Baolu Lu

[permalink] [raw]
Subject: Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush

On 4/9/24 5:03 AM, Jacob Pan wrote:
> Hi Lu,

Hi Jacob,

>
> On Sun, 7 Apr 2024 22:42:32 +0800, Lu Baolu<[email protected]>
> wrote:
>
>> The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
>> implementation caches not-present or erroneous translation-structure
>> entries except the first-stage translation. The caching mode is
>> unrelated to the device TLB , therefore there is no need to check
>> it before a device TLB invalidation operation.
>>
>> Before the scalable mode is introduced, caching mode is treated as
>> an indication that the driver is running in a VM guest. This is just
>> a software contract as shadow page table is the only way to implement
>> a virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
>> the scalable mode is introduced, this doesn't stand for anymore, as
>> caching mode is not relevant for the first-stage translation. A virtual
>> IOMMU implementation is free to support first-stage translation only
>> with caching mode cleared.
>>
>> Remove the caching mode check before device TLB invalidation to ensure
>> compatibility with the scalable mode use cases.
>>
> I agree with the changes below, what about this CM check:
>
> /* Notification for newly created mappings */
> static void __mapping_notify_one(struct intel_iommu *iommu, struct dmar_domain *domain,
> unsigned long pfn, unsigned int pages)
> {
> /*
> * It's a non-present to present mapping. Only flush if caching mode
> * and second level.
> */
> if (cap_caching_mode(iommu->cap) && !domain->use_first_level)
> iommu_flush_iotlb_psi(iommu, domain, pfn, pages, 0, 1);
>
> We are still tying devTLB flush to CM=1, no?

__mapping_notify_one() is called in the path where some PTEs are changed
from non-present to present.

In this scenario,

- if CM is set and first-stage translation is not used, the IOTLB caches
are required to be explicitly flushed.
- else if hardware requires write buffer flushing, do it.
- Otherwise, no op.
- devtlb invalidation is irrelevant to this path.

The code after the fix appears to do the right thing. devTLB is not
invalidated in iommu_flush_iotlb_psi() since it's a map (map == 1).

Or perhaps I overlooked anything?

>
> If we are running in the guest with second level page table (shadowed), can
> we decide if devTLB flush is needed based on ATS enable just as the rest of
> the cases?

I think the ATS check should be consistent. It's generic no matter how
the IOMMU is implemented (in hardware or emulated in software).

Best regards,
baolu

2024-04-09 07:20:32

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH 1/2] iommu/vt-d: Avoid unnecessary device TLB flush in map path

> From: Lu Baolu <[email protected]>
> Sent: Sunday, April 7, 2024 10:43 PM
>
> iommu_flush_iotlb_psi() is called in map and unmap paths. The caching
> mode check before device TLB invalidation will cause device TLB
> invalidation always issued if IOMMU is not running in the caching mode.
> This is inefficient and causes performance overhead.

s/inefficient/wrong/

'inefficient' means that it's a right thing to do but just inefficient compared
to other options. But here by definition it's not required and caching mode
is irrelevant in the context of devtlb.

>
> Make device TLB invalidation behavior consistent between batched mode
> unmapping and strict mode unmapping. Device TLB invalidation should only

I don't quite understand the connection to batch vs. strict.

> be requested in the unmap path if the IOMMU is not in caching mode.

I'll remove the part about caching mode as it's irrelevant.

>
> Fixes: bf92df30df90 ("intel-iommu: Only avoid flushing device IOTLB for
> domain ID 0 in caching mode")
> Signed-off-by: Lu Baolu <[email protected]>
> ---
> drivers/iommu/intel/iommu.c | 6 +-----
> 1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 50eb9aed47cc..493b6a600394 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -1501,11 +1501,7 @@ static void iommu_flush_iotlb_psi(struct
> intel_iommu *iommu,
> else
> __iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
>
> - /*
> - * In caching mode, changes of pages from non-present to present
> require
> - * flush. However, device IOTLB doesn't need to be flushed in this
> case.
> - */
> - if (!cap_caching_mode(iommu->cap) || !map)
> + if (!cap_caching_mode(iommu->cap) && !map)
> iommu_flush_dev_iotlb(domain, addr, mask);

It's a bit strange to do this half-baked way and then rely on next patch
to do the right thing. It temporarily removes all devtlb invalidations
for caching mode here and then add them back for unmap in next patch.

caching mode has nothing to do with devtlb. So I'd just do:

if (!map)
iommu_flush_dev_iotlb(domain, addr, mask);



2024-04-09 07:21:48

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH 1/2] iommu/vt-d: Avoid unnecessary device TLB flush in map path

> From: Lu Baolu <[email protected]>
> Sent: Sunday, April 7, 2024 10:43 PM
>
> iommu_flush_iotlb_psi() is called in map and unmap paths. The caching
> mode check before device TLB invalidation will cause device TLB
> invalidation always issued if IOMMU is not running in the caching mode.
> This is inefficient and causes performance overhead.
>
> Make device TLB invalidation behavior consistent between batched mode
> unmapping and strict mode unmapping. Device TLB invalidation should only
> be requested in the unmap path if the IOMMU is not in caching mode.
>
> Fixes: bf92df30df90 ("intel-iommu: Only avoid flushing device IOTLB for
> domain ID 0 in caching mode")
> Signed-off-by: Lu Baolu <[email protected]>

and given it only affects performance then not qualified for backporting?

2024-04-09 07:30:31

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush

> From: Lu Baolu <[email protected]>
> Sent: Sunday, April 7, 2024 10:43 PM
>
> The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
> implementation caches not-present or erroneous translation-structure
> entries except the first-stage translation. The caching mode is
> unrelated to the device TLB , therefore there is no need to check
> it before a device TLB invalidation operation.
>
> Before the scalable mode is introduced, caching mode is treated as
> an indication that the driver is running in a VM guest. This is just
> a software contract as shadow page table is the only way to implement
> a virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
> the scalable mode is introduced, this doesn't stand for anymore, as
> caching mode is not relevant for the first-stage translation. A virtual
> IOMMU implementation is free to support first-stage translation only
> with caching mode cleared.

I didn't get the connection to the scalable mode.

if required we can still use caching mode to imply running as a guest.
Just need to make sure its implementation conforming to the VT-d spec.

>
> Remove the caching mode check before device TLB invalidation to ensure
> compatibility with the scalable mode use cases.
>
> Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by
> default")
> Signed-off-by: Lu Baolu <[email protected]>
> ---
> drivers/iommu/intel/iommu.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 493b6a600394..681789b1258d 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -1501,7 +1501,7 @@ static void iommu_flush_iotlb_psi(struct
> intel_iommu *iommu,
> else
> __iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
>
> - if (!cap_caching_mode(iommu->cap) && !map)
> + if (!map)
> iommu_flush_dev_iotlb(domain, addr, mask);

as commented earlier better squash this in patch1.

> }
>
> @@ -1575,8 +1575,7 @@ static void intel_flush_iotlb_all(struct
> iommu_domain *domain)
> iommu->flush.flush_iotlb(iommu, did, 0, 0,
> DMA_TLB_DSI_FLUSH);
>
> - if (!cap_caching_mode(iommu->cap))
> - iommu_flush_dev_iotlb(dmar_domain, 0,
> MAX_AGAW_PFN_WIDTH);
> + iommu_flush_dev_iotlb(dmar_domain, 0,
> MAX_AGAW_PFN_WIDTH);
> }
>

I'm hesitating to agree with this change. Strictly speaking it's correct.
but w/o supporting batch invalidation this implies performance drop
on viommu due to more VM-exits and there may incur user complaints
when their VMs upgrade to a newer kernel version.

So it'd be better to keep this behavior and fix it together with batch
invalidation support. Anyway none of the viommu implementations
today (either shadow or nested translation) relies on the correct
devtlb behavior from the guest otherwise it's already broken.

2024-04-09 08:25:26

by Yi Liu

[permalink] [raw]
Subject: Re: [PATCH 1/2] iommu/vt-d: Avoid unnecessary device TLB flush in map path

On 2024/4/7 22:42, Lu Baolu wrote:
> iommu_flush_iotlb_psi() is called in map and unmap paths. The caching
> mode check before device TLB invalidation will cause device TLB
> invalidation always issued if IOMMU is not running in the caching mode.
> This is inefficient and causes performance overhead.
>
> Make device TLB invalidation behavior consistent between batched mode
> unmapping and strict mode unmapping. Device TLB invalidation should only
> be requested in the unmap path if the IOMMU is not in caching mode.
>
> Fixes: bf92df30df90 ("intel-iommu: Only avoid flushing device IOTLB for domain ID 0 in caching mode")
> Signed-off-by: Lu Baolu <[email protected]>
> ---
> drivers/iommu/intel/iommu.c | 6 +-----
> 1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 50eb9aed47cc..493b6a600394 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -1501,11 +1501,7 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
> else
> __iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
>
> - /*
> - * In caching mode, changes of pages from non-present to present require
> - * flush. However, device IOTLB doesn't need to be flushed in this case.
> - */
> - if (!cap_caching_mode(iommu->cap) || !map)
> + if (!cap_caching_mode(iommu->cap) && !map)
> iommu_flush_dev_iotlb(domain, addr, mask);
> }
>

The existing code works but kind of confusing. The iommu_flush_iotlb_psi()
helper will be called in both the map and unmap path. But device-TLB only
needed to be flushed in the unmap path since there is no chance for
device ATC to have cache for a non-present mapping. And only when caching
mode is reported, then should the helper be called in the map path. So the
fact is if caching mode is 0, the @map should always be false. If caching
mode is 1, then @map can be either false or true. To be simpler, it should
be enough to check @map before flushing device-TLB. no matter caching mode
or not.

--
Regards,
Yi Liu

2024-04-09 08:40:04

by Yi Liu

[permalink] [raw]
Subject: Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush

On 2024/4/7 22:42, Lu Baolu wrote:
> The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
> implementation caches not-present or erroneous translation-structure
> entries except the first-stage translation. The caching mode is
> unrelated to the device TLB , therefore there is no need to check
> it before a device TLB invalidation operation.
>
> Before the scalable mode is introduced, caching mode is treated as
> an indication that the driver is running in a VM guest. This is just
> a software contract as shadow page table is the only way to implement
> a virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
> the scalable mode is introduced, this doesn't stand for anymore, as
> caching mode is not relevant for the first-stage translation. A virtual
> IOMMU implementation is free to support first-stage translation only
> with caching mode cleared.
>
> Remove the caching mode check before device TLB invalidation to ensure
> compatibility with the scalable mode use cases.
>
> Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by default")
> Signed-off-by: Lu Baolu <[email protected]>
> ---
> drivers/iommu/intel/iommu.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 493b6a600394..681789b1258d 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -1501,7 +1501,7 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
> else
> __iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
>
> - if (!cap_caching_mode(iommu->cap) && !map)
> + if (!map)
> iommu_flush_dev_iotlb(domain, addr, mask);
> }
>

IMHO. this change can be merged with patch 01. And the reason is to make
the logic simple and clear.

> @@ -1575,8 +1575,7 @@ static void intel_flush_iotlb_all(struct iommu_domain *domain)
> iommu->flush.flush_iotlb(iommu, did, 0, 0,
> DMA_TLB_DSI_FLUSH);
>
> - if (!cap_caching_mode(iommu->cap))
> - iommu_flush_dev_iotlb(dmar_domain, 0, MAX_AGAW_PFN_WIDTH);
> + iommu_flush_dev_iotlb(dmar_domain, 0, MAX_AGAW_PFN_WIDTH);
> }
>
> if (dmar_domain->nested_parent)

you can also move the iommu_flush_dev_iotlb() call out of the iommu_array
loop since it does not require checking iommu cap.

--
Regards,
Yi Liu

2024-04-09 17:27:24

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush

Hi Baolu,

On Tue, 9 Apr 2024 11:12:20 +0800, Baolu Lu <[email protected]>
wrote:

> On 4/9/24 5:03 AM, Jacob Pan wrote:
> > Hi Lu,
>
> Hi Jacob,
>
> >
> > On Sun, 7 Apr 2024 22:42:32 +0800, Lu Baolu<[email protected]>
> > wrote:
> >
> >> The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
> >> implementation caches not-present or erroneous translation-structure
> >> entries except the first-stage translation. The caching mode is
> >> unrelated to the device TLB , therefore there is no need to check
> >> it before a device TLB invalidation operation.
> >>
> >> Before the scalable mode is introduced, caching mode is treated as
> >> an indication that the driver is running in a VM guest. This is just
> >> a software contract as shadow page table is the only way to implement
> >> a virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
> >> the scalable mode is introduced, this doesn't stand for anymore, as
> >> caching mode is not relevant for the first-stage translation. A virtual
> >> IOMMU implementation is free to support first-stage translation only
> >> with caching mode cleared.
> >>
> >> Remove the caching mode check before device TLB invalidation to ensure
> >> compatibility with the scalable mode use cases.
> >>
> > I agree with the changes below, what about this CM check:
> >
> > /* Notification for newly created mappings */
> > static void __mapping_notify_one(struct intel_iommu *iommu, struct
> > dmar_domain *domain, unsigned long pfn, unsigned int pages)
> > {
> > /*
> > * It's a non-present to present mapping. Only flush if caching
> > mode
> > * and second level.
> > */
> > if (cap_caching_mode(iommu->cap) && !domain->use_first_level)
> > iommu_flush_iotlb_psi(iommu, domain, pfn, pages, 0, 1);
> >
> > We are still tying devTLB flush to CM=1, no?
>
> __mapping_notify_one() is called in the path where some PTEs are changed
> from non-present to present.
>
> In this scenario,
>
> - if CM is set and first-stage translation is not used, the IOTLB caches
> are required to be explicitly flushed.
> - else if hardware requires write buffer flushing, do it.
> - Otherwise, no op.
> - devtlb invalidation is irrelevant to this path.
>
> The code after the fix appears to do the right thing. devTLB is not
> invalidated in iommu_flush_iotlb_psi() since it's a map (map == 1).
>
> Or perhaps I overlooked anything?
My confusion is that, on one side, this patch is saying devTLB flush has
nothing to do with CM. But here, if CMD==1, we don't flush devTLB since
map==1.

If the guest uses SL page tables in vIOMMU, we don;t expose ATS to the
guest. So ATS is not relevant here, does't matter map or unmap.

Can we remove the map argument in iommu_flush_iotlb_psi(iommu, domain,pfn,
pages, 0, 1)?

Then devTLB flush will naturally be skipped in the guest (CM=1, SL) since
ATS is not enabled.
iommu_flush_dev_iotlb(domain, addr, mask);

i.e.

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 50eb9aed47cc..ee3e5a1af0c5 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1483,7 +1483,7 @@ static void __iommu_flush_iotlb_psi(struct intel_iommu *iommu, u16 did,
static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
struct dmar_domain *domain,
unsigned long pfn, unsigned int pages,
- int ih, int map)
+ int ih)
{
unsigned int aligned_pages = __roundup_pow_of_two(pages);
unsigned int mask = ilog2(aligned_pages);
@@ -1501,12 +1501,7 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
else
__iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);

- /*
- * In caching mode, changes of pages from non-present to present require
- * flush. However, device IOTLB doesn't need to be flushed in this case.
- */
- if (!cap_caching_mode(iommu->cap) || !map)
- iommu_flush_dev_iotlb(domain, addr, mask);
+ iommu_flush_dev_iotlb(domain, addr, mask);
}

/* Notification for newly created mappings */
@@ -1518,7 +1513,7 @@ static void __mapping_notify_one(struct intel_iommu *iommu, struct dmar_domain *
* and second level.
*/
if (cap_caching_mode(iommu->cap) && !domain->use_first_level)
- iommu_flush_iotlb_psi(iommu, domain, pfn, pages, 0, 1);
+ iommu_flush_iotlb_psi(iommu, domain, pfn, pages, 0);
else
iommu_flush_write_buffer(iommu);
}



> >
> > If we are running in the guest with second level page table (shadowed),
> > can we decide if devTLB flush is needed based on ATS enable just as the
> > rest of the cases?
>
> I think the ATS check should be consistent. It's generic no matter how
> the IOMMU is implemented (in hardware or emulated in software).
>
> Best regards,
> baolu


Thanks,

Jacob

2024-04-10 00:32:18

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush

> From: Jacob Pan <[email protected]>
> Sent: Wednesday, April 10, 2024 1:32 AM
>
> If the guest uses SL page tables in vIOMMU, we don;t expose ATS to the
> guest. So ATS is not relevant here, does't matter map or unmap.
>

ATS is orthogonal to SL vs. FL. Where is this restriction coming from?

2024-04-10 05:41:31

by Baolu Lu

[permalink] [raw]
Subject: Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush

On 4/9/24 3:30 PM, Tian, Kevin wrote:
>> Remove the caching mode check before device TLB invalidation to ensure
>> compatibility with the scalable mode use cases.
>>
>> Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by
>> default")
>> Signed-off-by: Lu Baolu<[email protected]>
>> ---
>> drivers/iommu/intel/iommu.c | 5 ++---
>> 1 file changed, 2 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>> index 493b6a600394..681789b1258d 100644
>> --- a/drivers/iommu/intel/iommu.c
>> +++ b/drivers/iommu/intel/iommu.c
>> @@ -1501,7 +1501,7 @@ static void iommu_flush_iotlb_psi(struct
>> intel_iommu *iommu,
>> else
>> __iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
>>
>> - if (!cap_caching_mode(iommu->cap) && !map)
>> + if (!map)
>> iommu_flush_dev_iotlb(domain, addr, mask);
> as commented earlier better squash this in patch1.

Okay, let me squash them into a single patch and make the commit message
more descriptive.

Best regards,
baolu

2024-04-10 16:20:06

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush

Hi Kevin,

On Wed, 10 Apr 2024 00:32:06 +0000, "Tian, Kevin" <[email protected]>
wrote:

> > From: Jacob Pan <[email protected]>
> > Sent: Wednesday, April 10, 2024 1:32 AM
> >
> > If the guest uses SL page tables in vIOMMU, we don;t expose ATS to the
> > guest. So ATS is not relevant here, does't matter map or unmap.
> >
>
> ATS is orthogonal to SL vs. FL. Where is this restriction coming from?
For practical purposes, what would be the usage to have SL in the guest and
ATS enabled. i.e. shadowing SL but directly expose ATS?

Thanks,

Jacob

2024-04-10 23:24:15

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush

> From: Jacob Pan <[email protected]>
> Sent: Thursday, April 11, 2024 12:20 AM
>
> Hi Kevin,
>
> On Wed, 10 Apr 2024 00:32:06 +0000, "Tian, Kevin" <[email protected]>
> wrote:
>
> > > From: Jacob Pan <[email protected]>
> > > Sent: Wednesday, April 10, 2024 1:32 AM
> > >
> > > If the guest uses SL page tables in vIOMMU, we don;t expose ATS to the
> > > guest. So ATS is not relevant here, does't matter map or unmap.
> > >
> >
> > ATS is orthogonal to SL vs. FL. Where is this restriction coming from?
> For practical purposes, what would be the usage to have SL in the guest and
> ATS enabled. i.e. shadowing SL but directly expose ATS?
>

ATS is about the protocol between device and iommu to look up
translations. Why does it care about internal paging layout in
iommu?

I'm not sure which spec explicitly states such restriction.

2024-04-10 23:49:25

by Zhang, Tina

[permalink] [raw]
Subject: RE: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush

Hi,

> -----Original Message-----
> From: Tian, Kevin <[email protected]>
> Sent: Tuesday, April 9, 2024 3:30 PM
> To: Lu Baolu <[email protected]>; [email protected]
> Cc: Liu, Yi L <[email protected]>; Joerg Roedel <[email protected]>; Will
> Deacon <[email protected]>; Robin Murphy <[email protected]>; linux-
> [email protected]
> Subject: RE: [PATCH 2/2] iommu/vt-d: Remove caching mode check before
> devtlb flush
>
> > From: Lu Baolu <[email protected]>
> > Sent: Sunday, April 7, 2024 10:43 PM
> >
> > The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
> > implementation caches not-present or erroneous translation-structure
> > entries except the first-stage translation. The caching mode is
> > unrelated to the device TLB , therefore there is no need to check it
> > before a device TLB invalidation operation.
> >
> > Before the scalable mode is introduced, caching mode is treated as an
> > indication that the driver is running in a VM guest. This is just a
> > software contract as shadow page table is the only way to implement a
> > virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
> > the scalable mode is introduced, this doesn't stand for anymore, as
> > caching mode is not relevant for the first-stage translation. A
> > virtual IOMMU implementation is free to support first-stage
> > translation only with caching mode cleared.
>
> I didn't get the connection to the scalable mode.
>
> if required we can still use caching mode to imply running as a guest.
> Just need to make sure its implementation conforming to the VT-d spec.
>
> >
> > Remove the caching mode check before device TLB invalidation to ensure
> > compatibility with the scalable mode use cases.
> >
> > Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by
> > default")
> > Signed-off-by: Lu Baolu <[email protected]>
> > ---
> > drivers/iommu/intel/iommu.c | 5 ++---
> > 1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index 493b6a600394..681789b1258d 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -1501,7 +1501,7 @@ static void iommu_flush_iotlb_psi(struct
> > intel_iommu *iommu,
> > else
> > __iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
> >
> > - if (!cap_caching_mode(iommu->cap) && !map)
> > + if (!map)
> > iommu_flush_dev_iotlb(domain, addr, mask);
>
> as commented earlier better squash this in patch1.
>
> > }
> >
> > @@ -1575,8 +1575,7 @@ static void intel_flush_iotlb_all(struct
> > iommu_domain *domain)
> > iommu->flush.flush_iotlb(iommu, did, 0, 0,
> > DMA_TLB_DSI_FLUSH);
> >
> > - if (!cap_caching_mode(iommu->cap))
> > - iommu_flush_dev_iotlb(dmar_domain, 0,
> > MAX_AGAW_PFN_WIDTH);
> > + iommu_flush_dev_iotlb(dmar_domain, 0,
> > MAX_AGAW_PFN_WIDTH);
> > }
> >
>
> I'm hesitating to agree with this change. Strictly speaking it's correct.
> but w/o supporting batch invalidation this implies performance drop on
> viommu due to more VM-exits and there may incur user complaints when
> their VMs upgrade to a newer kernel version.
>
> So it'd be better to keep this behavior and fix it together with batch
> invalidation support. Anyway none of the viommu implementations today
> (either shadow or nested translation) relies on the correct devtlb behavior
> from the guest otherwise it's already broken.
How about we split this change into a patch. I'm working on the batch invalidation patch-set now and I'm happy to include this code change into the batch invalidation series.

Regards,
-Tina


2024-04-11 12:15:55

by Baolu Lu

[permalink] [raw]
Subject: Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush

On 2024/4/11 7:49, Zhang, Tina wrote:
> Hi,
>
>> -----Original Message-----
>> From: Tian, Kevin <[email protected]>
>> Sent: Tuesday, April 9, 2024 3:30 PM
>> To: Lu Baolu <[email protected]>; [email protected]
>> Cc: Liu, Yi L <[email protected]>; Joerg Roedel <[email protected]>; Will
>> Deacon <[email protected]>; Robin Murphy <[email protected]>; linux-
>> [email protected]
>> Subject: RE: [PATCH 2/2] iommu/vt-d: Remove caching mode check before
>> devtlb flush
>>
>>> From: Lu Baolu <[email protected]>
>>> Sent: Sunday, April 7, 2024 10:43 PM
>>>
>>> The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
>>> implementation caches not-present or erroneous translation-structure
>>> entries except the first-stage translation. The caching mode is
>>> unrelated to the device TLB , therefore there is no need to check it
>>> before a device TLB invalidation operation.
>>>
>>> Before the scalable mode is introduced, caching mode is treated as an
>>> indication that the driver is running in a VM guest. This is just a
>>> software contract as shadow page table is the only way to implement a
>>> virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
>>> the scalable mode is introduced, this doesn't stand for anymore, as
>>> caching mode is not relevant for the first-stage translation. A
>>> virtual IOMMU implementation is free to support first-stage
>>> translation only with caching mode cleared.
>>
>> I didn't get the connection to the scalable mode.
>>
>> if required we can still use caching mode to imply running as a guest.
>> Just need to make sure its implementation conforming to the VT-d spec.
>>
>>>
>>> Remove the caching mode check before device TLB invalidation to ensure
>>> compatibility with the scalable mode use cases.
>>>
>>> Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by
>>> default")
>>> Signed-off-by: Lu Baolu <[email protected]>
>>> ---
>>> drivers/iommu/intel/iommu.c | 5 ++---
>>> 1 file changed, 2 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>>> index 493b6a600394..681789b1258d 100644
>>> --- a/drivers/iommu/intel/iommu.c
>>> +++ b/drivers/iommu/intel/iommu.c
>>> @@ -1501,7 +1501,7 @@ static void iommu_flush_iotlb_psi(struct
>>> intel_iommu *iommu,
>>> else
>>> __iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
>>>
>>> - if (!cap_caching_mode(iommu->cap) && !map)
>>> + if (!map)
>>> iommu_flush_dev_iotlb(domain, addr, mask);
>>
>> as commented earlier better squash this in patch1.
>>
>>> }
>>>
>>> @@ -1575,8 +1575,7 @@ static void intel_flush_iotlb_all(struct
>>> iommu_domain *domain)
>>> iommu->flush.flush_iotlb(iommu, did, 0, 0,
>>> DMA_TLB_DSI_FLUSH);
>>>
>>> - if (!cap_caching_mode(iommu->cap))
>>> - iommu_flush_dev_iotlb(dmar_domain, 0,
>>> MAX_AGAW_PFN_WIDTH);
>>> + iommu_flush_dev_iotlb(dmar_domain, 0,
>>> MAX_AGAW_PFN_WIDTH);
>>> }
>>>
>>
>> I'm hesitating to agree with this change. Strictly speaking it's correct.
>> but w/o supporting batch invalidation this implies performance drop on
>> viommu due to more VM-exits and there may incur user complaints when
>> their VMs upgrade to a newer kernel version.
>>
>> So it'd be better to keep this behavior and fix it together with batch
>> invalidation support. Anyway none of the viommu implementations today
>> (either shadow or nested translation) relies on the correct devtlb behavior
>> from the guest otherwise it's already broken.
> How about we split this change into a patch. I'm working on the batch invalidation patch-set now and I'm happy to include this code change into the batch invalidation series.

No worries. It turned out that these two checks are just unnecessary.
Removing them won't cause any driver behavior change. So, I will make it
a cleanup patch and target it for the next merge window.

Best regards,
baolu

2024-04-11 21:08:35

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush

Hi Kevin,

On Wed, 10 Apr 2024 23:23:57 +0000, "Tian, Kevin" <[email protected]>
wrote:

> > From: Jacob Pan <[email protected]>
> > Sent: Thursday, April 11, 2024 12:20 AM
> >
> > Hi Kevin,
> >
> > On Wed, 10 Apr 2024 00:32:06 +0000, "Tian, Kevin" <[email protected]>
> > wrote:
> >
> > > > From: Jacob Pan <[email protected]>
> > > > Sent: Wednesday, April 10, 2024 1:32 AM
> > > >
> > > > If the guest uses SL page tables in vIOMMU, we don;t expose ATS to
> > > > the guest. So ATS is not relevant here, does't matter map or unmap.
> > > >
> > >
> > > ATS is orthogonal to SL vs. FL. Where is this restriction coming
> > > from?
> > For practical purposes, what would be the usage to have SL in the guest
> > and ATS enabled. i.e. shadowing SL but directly expose ATS?
> >
>
> ATS is about the protocol between device and iommu to look up
> translations. Why does it care about internal paging layout in
> iommu?
>
Maybe the original intent was missed, I was suggesting the devTLB flush
should be based on ATS cap (as you said here) not map/unmap.

- /*
- * In caching mode, changes of pages from non-present to present require
- * flush. However, device IOTLB doesn't need to be flushed in this case.
- */
- if (!cap_caching_mode(iommu->cap) || !map)
- iommu_flush_dev_iotlb(domain, addr, mask);
+ iommu_flush_dev_iotlb(domain, addr, mask);


Thanks,

Jacob

2024-04-12 03:17:14

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush

> From: Jacob Pan <[email protected]>
> Sent: Friday, April 12, 2024 12:18 AM
>
> Hi Kevin,
>
> On Wed, 10 Apr 2024 23:23:57 +0000, "Tian, Kevin" <[email protected]>
> wrote:
>
> > > From: Jacob Pan <[email protected]>
> > > Sent: Thursday, April 11, 2024 12:20 AM
> > >
> > > Hi Kevin,
> > >
> > > On Wed, 10 Apr 2024 00:32:06 +0000, "Tian, Kevin"
> <[email protected]>
> > > wrote:
> > >
> > > > > From: Jacob Pan <[email protected]>
> > > > > Sent: Wednesday, April 10, 2024 1:32 AM
> > > > >
> > > > > If the guest uses SL page tables in vIOMMU, we don;t expose ATS to
> > > > > the guest. So ATS is not relevant here, does't matter map or unmap.
> > > > >
> > > >
> > > > ATS is orthogonal to SL vs. FL. Where is this restriction coming
> > > > from?
> > > For practical purposes, what would be the usage to have SL in the guest
> > > and ATS enabled. i.e. shadowing SL but directly expose ATS?
> > >
> >
> > ATS is about the protocol between device and iommu to look up
> > translations. Why does it care about internal paging layout in
> > iommu?
> >
> Maybe the original intent was missed, I was suggesting the devTLB flush
> should be based on ATS cap (as you said here) not map/unmap.
>
> - /*
> - * In caching mode, changes of pages from non-present to present
> require
> - * flush. However, device IOTLB doesn't need to be flushed in this case.
> - */
> - if (!cap_caching_mode(iommu->cap) || !map)
> - iommu_flush_dev_iotlb(domain, addr, mask);
> + iommu_flush_dev_iotlb(domain, addr, mask);
>

We need check both, as devtlb doesn't cache non-present
so the invalidation is required only for unmap.

Here just the check of caching mode is irrelevant.