2022-05-09 10:33:48

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [PATCH v2] iommu/amd: Set translation valid bit only when IO page tables are in used

On AMD system with SNP enabled, IOMMU hardware checks the host translation
valid (TV) and guest translation valid (GV) bits in the device
table entry (DTE) before accessing the corresponded page tables.

However, current IOMMU driver sets the TV bit for all devices
regardless of whether the host page table is in used.
This results in ILLEGAL_DEV_TABLE_ENTRY event for devices, which
do not the host page table root pointer set up.

Thefore, only set TV bit when DMA remapping is not used, which is
when domain ID in the AMD IOMMU device table entry (DTE) is zero.

Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
drivers/iommu/amd/init.c | 4 +---
drivers/iommu/amd/iommu.c | 8 ++++++--
2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 648d6b94ba8c..6a2dadf2b2dc 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -2336,10 +2336,8 @@ static void init_device_table_dma(void)
{
u32 devid;

- for (devid = 0; devid <= amd_iommu_last_bdf; ++devid) {
+ for (devid = 0; devid <= amd_iommu_last_bdf; ++devid)
set_dev_entry_bit(devid, DEV_ENTRY_VALID);
- set_dev_entry_bit(devid, DEV_ENTRY_TRANSLATION);
- }
}

static void __init uninit_device_table_dma(void)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index a1ada7bff44e..cea254968f06 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1473,7 +1473,7 @@ static void set_dte_entry(u16 devid, struct protection_domain *domain,

pte_root |= (domain->iop.mode & DEV_ENTRY_MODE_MASK)
<< DEV_ENTRY_MODE_SHIFT;
- pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV;
+ pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V;

flags = amd_iommu_dev_table[devid].data[1];

@@ -1513,6 +1513,10 @@ static void set_dte_entry(u16 devid, struct protection_domain *domain,
flags |= tmp;
}

+ /* Only set TV bit when IOMMU page translation is in used */
+ if (domain->id != 0)
+ pte_root |= DTE_FLAG_TV;
+
flags &= ~DEV_DOMID_MASK;
flags |= domain->id;

@@ -1535,7 +1539,7 @@ static void set_dte_entry(u16 devid, struct protection_domain *domain,
static void clear_dte_entry(u16 devid)
{
/* remove entry from the device table seen by the hardware */
- amd_iommu_dev_table[devid].data[0] = DTE_FLAG_V | DTE_FLAG_TV;
+ amd_iommu_dev_table[devid].data[0] = DTE_FLAG_V;
amd_iommu_dev_table[devid].data[1] &= DTE_FLAG_MASK;

amd_iommu_apply_erratum_63(devid);
--
2.25.1



2022-05-14 04:05:49

by Joerg Roedel

[permalink] [raw]
Subject: Re: [PATCH v2] iommu/amd: Set translation valid bit only when IO page tables are in used

On Mon, May 09, 2022 at 02:48:15AM -0500, Suravee Suthikulpanit wrote:
> On AMD system with SNP enabled, IOMMU hardware checks the host translation
> valid (TV) and guest translation valid (GV) bits in the device
> table entry (DTE) before accessing the corresponded page tables.
>
> However, current IOMMU driver sets the TV bit for all devices
> regardless of whether the host page table is in used.
> This results in ILLEGAL_DEV_TABLE_ENTRY event for devices, which
> do not the host page table root pointer set up.

Hmm, this sound weird. In the early AMD IOMMUs it was recommended to set
TV=1 and V=1 and the rest to 0 to block all DMA from a device.

I wonder how this triggers ILLEGAL_DEV_TABLE_ENTRY errors now. It is
(was?) legal to set V=1 TV=1, mode=0 and leave the page-table empty.
When then IW=0 and IR=0, DMA is blocked. From what I remember this is a
valid setting in a DTE. Do you have an example DTE which triggers this
error message?

Regards,

Joerg


2022-05-17 03:11:37

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: Re: [PATCH v2] iommu/amd: Set translation valid bit only when IO page tables are in used

Joerg,

On 5/13/22 8:07 PM, Joerg Roedel wrote:
> On Mon, May 09, 2022 at 02:48:15AM -0500, Suravee Suthikulpanit wrote:
>> On AMD system with SNP enabled, IOMMU hardware checks the host translation
>> valid (TV) and guest translation valid (GV) bits in the device
>> table entry (DTE) before accessing the corresponded page tables.
>>
>> However, current IOMMU driver sets the TV bit for all devices
>> regardless of whether the host page table is in used.
>> This results in ILLEGAL_DEV_TABLE_ENTRY event for devices, which
>> do not the host page table root pointer set up.
>
> Hmm, this sound weird. In the early AMD IOMMUs it was recommended to set
> TV=1 and V=1 and the rest to 0 to block all DMA from a device.
>
> I wonder how this triggers ILLEGAL_DEV_TABLE_ENTRY errors now. It is
> (was?) legal to set V=1 TV=1, mode=0 and leave the page-table empty.

Due to the new restriction (please see the IOMMU spec Rev 3.06-PUB - Apr 2021
https://www.amd.com/system/files/TechDocs/48882_IOMMU.pdf) where the use of
DTE[Mode]=0 is not supported on systems that are SNP-enabled (i.e. EFR[SNPSup]=1),
the IOMMU HW looks at the DTE[TV] bit to determine if it needs to handle the v1 page table.
When the HW encounters DTE entry with TV=1, V=1, Mode=0, it would generate
ILLEGAL_DEV_TABLE_ENTRY event.

Note: I am following up with HW folks for the updated document for this
specific detail.

Therefore, we need to modify IOMMU driver as following:

- For non-DMA devices (e.g. the IOAPIC devices), we need to
modify IOMMU driver to default to DTE[TV]=0. For Linux, this is equivalent
to DTE with domain ID 0.

- I am still trying to see what is the best way to force Linux to not allow
Mode=0 (i.e. iommu=pt mode). Any thoughts?

- Also, it seems that the current iommu v2 page table use case, where GVA->GPA=SPA
will no longer be supported on system w/ SNPSup=1. Any thoughts?

> When then IW=0 and IR=0, DMA is blocked. From what I remember this is a
> valid setting in a DTE.

Correct.

> Do you have an example DTE which triggers this error message?

This is specifically from the device representing an IOAPIC.

[ +0.000108] iommu ivhd0: AMD-Vi: Event logged [ILLEGAL_DEV_TABLE_ENTRY device=c0:00.1 pasid=0x00000
address=0xfffffffdf8140000 flags=0x0008]
[ +0.000011] AMD-Vi: DTE[0]: 0000000000000003
[ +0.000003] AMD-Vi: DTE[1]: 0000000000000000
[ +0.000002] AMD-Vi: DTE[2]: 2008000100258013
[ +0.000001] AMD-Vi: DTE[3]: 0000000000000000

Best Regards,
Suravee

2022-05-20 14:33:30

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH v2] iommu/amd: Set translation valid bit only when IO page tables are in used

On 2022-05-20 09:58, Joerg Roedel wrote:
> On Fri, May 20, 2022 at 09:54:51AM +0100, Robin Murphy wrote:
>> The .def_domain type op already allows drivers to do exactly this sort of
>> override. You could also conditionally reject IOMMU_DOMAIN_PASSTHROUGH in
>> .domain_alloc for good measure, provided that (for now at least*) SNP is a
>> global thing rather than per-instance.
>
> Yeah, that could work. I am just not sure the IOMMU core behaves well in
> all situations when allocation IOMMU_DOMAIN_PASSTHROUGH suddenly starts
> to fail. I would feel better if this is checked and tested :)

Well, iommu_group_alloc_default_domain() has the fallback and is
currently the only place that __iommu_domain_alloc() can be called with
a type other than IOMMU_DOMAIN_UNMANAGED, so by inspection it should be
fine. However if iommu_get_def_domain_type() says the right thing then
neither sysfs nor automatic default domains should get as far as even
trying to allocate an identity domain anyway - note that that's already
what happens for untrusted external devices. But either way should be
easy enough to verify with a quick hack, too.

Cheers,
Robin.

2022-05-20 20:38:18

by Joerg Roedel

[permalink] [raw]
Subject: Re: [PATCH v2] iommu/amd: Set translation valid bit only when IO page tables are in used

On Fri, May 20, 2022 at 09:54:51AM +0100, Robin Murphy wrote:
> The .def_domain type op already allows drivers to do exactly this sort of
> override. You could also conditionally reject IOMMU_DOMAIN_PASSTHROUGH in
> .domain_alloc for good measure, provided that (for now at least*) SNP is a
> global thing rather than per-instance.

Yeah, that could work. I am just not sure the IOMMU core behaves well in
all situations when allocation IOMMU_DOMAIN_PASSTHROUGH suddenly starts
to fail. I would feel better if this is checked and tested :)

Regards,

Joerg

2022-05-23 06:15:07

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH v2] iommu/amd: Set translation valid bit only when IO page tables are in used

On 2022-05-20 09:09, Joerg Roedel wrote:
> Hi Suravee,
>
> On Mon, May 16, 2022 at 07:27:51PM +0700, Suravee Suthikulpanit wrote:
>> Due to the new restriction (please see the IOMMU spec Rev 3.06-PUB - Apr 2021
>> https://www.amd.com/system/files/TechDocs/48882_IOMMU.pdf) where the use of
>> DTE[Mode]=0 is not supported on systems that are SNP-enabled (i.e. EFR[SNPSup]=1),
>> the IOMMU HW looks at the DTE[TV] bit to determine if it needs to handle the v1 page table.
>> When the HW encounters DTE entry with TV=1, V=1, Mode=0, it would generate
>> ILLEGAL_DEV_TABLE_ENTRY event.
>
> Ah, that is the part I was missing, thanks.
>
>> - I am still trying to see what is the best way to force Linux to not allow
>> Mode=0 (i.e. iommu=pt mode). Any thoughts?
>
> I think this needs a general approach. First start in the AMD IOMMU
> driver:
>
> 1) Do not set DTE.TV=1 bit iff SNP-Support is enabled
> 2) Fail to allocate passthrough domains when SNP support is
> enabled.
>
> Then test how the IOMMU core layer handles that. In fact the IOMMU layer
> needs to adjust its decisions so that:
>
> 1) It uses translated-mode by default
> 2) passthrough domains are disallowed and can not be chosen, not
> on the kernel command line and not at runtime.
>
> Ideally this needs some kind of arch-callback to set the appropriate
> defaults.

The .def_domain type op already allows drivers to do exactly this sort
of override. You could also conditionally reject
IOMMU_DOMAIN_PASSTHROUGH in .domain_alloc for good measure, provided
that (for now at least*) SNP is a global thing rather than per-instance.

Cheers,
Robin.

*Instance-aware .domain_alloc probably about 2 releases away at the
current pace of landing the tree-wide prep ;)

>> - Also, it seems that the current iommu v2 page table use case, where GVA->GPA=SPA
>> will no longer be supported on system w/ SNPSup=1. Any thoughts?
>
> Support for that is not upstream yet, it should be easy to disallow this
> configuration and just use the v1 page-tables when SNP is active. This
> can be handled entirely inside the AMD IOMMU driver.
>
> Regards,
>
> Joerg
> _______________________________________________
> iommu mailing list
> [email protected]
> https://lists.linuxfoundation.org/mailman/listinfo/iommu

2022-05-23 07:34:38

by Joerg Roedel

[permalink] [raw]
Subject: Re: [PATCH v2] iommu/amd: Set translation valid bit only when IO page tables are in used

Hi Suravee,

On Mon, May 16, 2022 at 07:27:51PM +0700, Suravee Suthikulpanit wrote:
> Due to the new restriction (please see the IOMMU spec Rev 3.06-PUB - Apr 2021
> https://www.amd.com/system/files/TechDocs/48882_IOMMU.pdf) where the use of
> DTE[Mode]=0 is not supported on systems that are SNP-enabled (i.e. EFR[SNPSup]=1),
> the IOMMU HW looks at the DTE[TV] bit to determine if it needs to handle the v1 page table.
> When the HW encounters DTE entry with TV=1, V=1, Mode=0, it would generate
> ILLEGAL_DEV_TABLE_ENTRY event.

Ah, that is the part I was missing, thanks.

> - I am still trying to see what is the best way to force Linux to not allow
> Mode=0 (i.e. iommu=pt mode). Any thoughts?

I think this needs a general approach. First start in the AMD IOMMU
driver:

1) Do not set DTE.TV=1 bit iff SNP-Support is enabled
2) Fail to allocate passthrough domains when SNP support is
enabled.

Then test how the IOMMU core layer handles that. In fact the IOMMU layer
needs to adjust its decisions so that:

1) It uses translated-mode by default
2) passthrough domains are disallowed and can not be chosen, not
on the kernel command line and not at runtime.

Ideally this needs some kind of arch-callback to set the appropriate
defaults.

> - Also, it seems that the current iommu v2 page table use case, where GVA->GPA=SPA
> will no longer be supported on system w/ SNPSup=1. Any thoughts?

Support for that is not upstream yet, it should be easy to disallow this
configuration and just use the v1 page-tables when SNP is active. This
can be handled entirely inside the AMD IOMMU driver.

Regards,

Joerg

2022-05-27 07:35:45

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: Re: [PATCH v2] iommu/amd: Set translation valid bit only when IO page tables are in used

Joerg,

On 5/20/22 3:09 PM, Joerg Roedel wrote:
> Hi Suravee,
>
> On Mon, May 16, 2022 at 07:27:51PM +0700, Suravee Suthikulpanit wrote:
>
>> - Also, it seems that the current iommu v2 page table use case, where GVA->GPA=SPA
>> will no longer be supported on system w/ SNPSup=1. Any thoughts?
>
> Support for that is not upstream yet, it should be easy to disallow this
> configuration and just use the v1 page-tables when SNP is active. This
> can be handled entirely inside the AMD IOMMU driver.
>

Actually, I am referring to when user uses the IOMMU v2 table for shared virtual address
in current iommu_v2 driver (e.g. amd_iommu_init_device(), amd_iommu_bind_pasid).

Best Regards,
Suravee

2022-06-07 11:52:05

by Joerg Roedel

[permalink] [raw]
Subject: Re: [PATCH v2] iommu/amd: Set translation valid bit only when IO page tables are in used

On Thu, May 26, 2022 at 10:29:09AM +0700, Suravee Suthikulpanit wrote:
> Actually, I am referring to when user uses the IOMMU v2 table for shared virtual address
> in current iommu_v2 driver (e.g. amd_iommu_init_device(), amd_iommu_bind_pasid).

From what I can see this is not handled yet and needs an additional
check. I think the best solution is to set iommu->iommu_v2 to false when
the SNP feature bit is set.
But that is probably not enough, all functions that are called from the
IOMMUv2 driver need to fail.

Regards,

Joerg