2021-07-09 03:36:17

by David Stevens

Subject: [PATCH v2 0/4] Fixes for dma-iommu swiotlb bounce buffers

From: David Stevens <[email protected]>

This patch set includes two fixes for bugs caused by mixing up the
original buffer's physical address and the bounce buffer's physical
address. It also includes a performance fix that avoids an extra copy,
as well as a general cleanup fix.

The issues were found via code inspection, so I don't have any specific
use cases where things were not working, nor any performance numbers.

v1 -> v2:
- Split fixes into dedicated patches
- Less invasive changes to fix arch_sync when mapping
- Leave dev_is_untrusted check for strict iommu

David Stevens (4):
dma-iommu: fix sync_sg with swiotlb
dma-iommu: fix arch_sync_dma for map with swiotlb
dma-iommu: pass SKIP_CPU_SYNC to swiotlb unmap
dma-iommu: Check CONFIG_SWIOTLB more broadly

drivers/iommu/dma-iommu.c | 63 ++++++++++++++++++++-------------------
1 file changed, 33 insertions(+), 30 deletions(-)

--
2.32.0.93.g670b81a890-goog


2021-07-09 03:36:17

by David Stevens

Subject: [PATCH v2 1/4] dma-iommu: fix sync_sg with swiotlb

From: David Stevens <[email protected]>

The is_swiotlb_buffer function takes the physical address of the swiotlb
buffer, not the physical address of the original buffer. The sglist
contains the physical addresses of the original buffer, so for the
sync_sg functions to work properly when a bounce buffer might have been
used, we need to use iommu_iova_to_phys to look up the physical address.
This is what sync_single does, so call that function on each sglist
segment.

The previous code mostly worked because swiotlb does the transfer on map
and unmap. However, any callers which use DMA_ATTR_SKIP_CPU_SYNC with
sglists or which call sync_sg would not have had anything copied to the
bounce buffer.
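
For illustration, a minimal sketch of the per-segment lookup that
sync_single performs and that sync_sg now reuses (names taken from the
diff below; this is an explanatory sketch, not part of the patch):

	static void sync_sg_segment_for_cpu(struct device *dev,
					    struct scatterlist *sg,
					    enum dma_data_direction dir)
	{
		/*
		 * sg_phys(sg) names the original pages, but a bounced
		 * segment was actually written to the bounce buffer, so
		 * the IOVA must be resolved first. Passing sg_phys(sg)
		 * straight to is_swiotlb_buffer() can never identify a
		 * bounced segment.
		 */
		phys_addr_t phys = iommu_iova_to_phys(iommu_get_dma_domain(dev),
						      sg_dma_address(sg));

		if (!dev_is_dma_coherent(dev))
			arch_sync_dma_for_cpu(phys, sg->length, dir);

		if (is_swiotlb_buffer(phys))
			swiotlb_sync_single_for_cpu(dev, phys, sg->length, dir);
	}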

Fixes: 82612d66d51d ("iommu: Allow the dma-iommu api to use bounce buffers")
Signed-off-by: David Stevens <[email protected]>
---
drivers/iommu/dma-iommu.c | 26 +++++++++++++-------------
1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 7bcdd1205535..eac65302439e 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -811,14 +811,14 @@ static void iommu_dma_sync_sg_for_cpu(struct device *dev,
if (dev_is_dma_coherent(dev) && !dev_is_untrusted(dev))
return;

- for_each_sg(sgl, sg, nelems, i) {
- if (!dev_is_dma_coherent(dev))
- arch_sync_dma_for_cpu(sg_phys(sg), sg->length, dir);
-
- if (is_swiotlb_buffer(sg_phys(sg)))
+ if (dev_is_untrusted(dev))
+ for_each_sg(sgl, sg, nelems, i)
+ iommu_dma_sync_single_for_cpu(dev, sg_dma_address(sg),
+ sg->length, dir);
+ else
+ for_each_sg(sgl, sg, nelems, i)
swiotlb_sync_single_for_cpu(dev, sg_phys(sg),
sg->length, dir);
- }
}

static void iommu_dma_sync_sg_for_device(struct device *dev,
@@ -831,14 +831,14 @@ static void iommu_dma_sync_sg_for_device(struct device *dev,
if (dev_is_dma_coherent(dev) && !dev_is_untrusted(dev))
return;

- for_each_sg(sgl, sg, nelems, i) {
- if (is_swiotlb_buffer(sg_phys(sg)))
- swiotlb_sync_single_for_device(dev, sg_phys(sg),
- sg->length, dir);
-
- if (!dev_is_dma_coherent(dev))
+ if (dev_is_untrusted(dev))
+ for_each_sg(sgl, sg, nelems, i)
+ iommu_dma_sync_single_for_device(dev,
+ sg_dma_address(sg),
+ sg->length, dir);
+ else
+ for_each_sg(sgl, sg, nelems, i)
arch_sync_dma_for_device(sg_phys(sg), sg->length, dir);
- }
}

static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
--
2.32.0.93.g670b81a890-goog

2021-07-09 03:36:17

by David Stevens

Subject: [PATCH v2 2/4] dma-iommu: fix arch_sync_dma for map with swiotlb

From: David Stevens <[email protected]>

When calling arch_sync_dma, we need to pass it the memory that's
actually being used for dma. When using swiotlb bounce buffers, this is
the bounce buffer. Move arch_sync_dma into the __iommu_dma_map_swiotlb
helper, so it can use the bounce buffer address if necessary. This also
means it is no longer necessary to call iommu_dma_sync_sg_for_device in
iommu_dma_map_sg for untrusted devices.
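
In other words, the ordering after this patch is as follows (simplified
sketch of __iommu_dma_map_swiotlb; the real alignment and padding
handling is elided, and "bounce_needed" stands in for the actual
untrusted-device check):

	if (bounce_needed)
		phys = swiotlb_tbl_map_single(dev, phys, org_size,
					      aligned_size, dir, attrs);

	/* Cache maintenance targets whatever the device will really access. */
	if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
		arch_sync_dma_for_device(phys, org_size, dir);

	iova = __iommu_dma_map(dev, phys, aligned_size, prot, dma_mask);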

Fixes: 82612d66d51d ("iommu: Allow the dma-iommu api to use bounce buffers")
Signed-off-by: David Stevens <[email protected]>
---
drivers/iommu/dma-iommu.c | 16 +++++++---------
1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index eac65302439e..e79e274d2dc5 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -574,6 +574,9 @@ static dma_addr_t __iommu_dma_map_swiotlb(struct device *dev, phys_addr_t phys,
memset(padding_start, 0, padding_size);
}

+ if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ arch_sync_dma_for_device(phys, org_size, dir);
+
iova = __iommu_dma_map(dev, phys, aligned_size, prot, dma_mask);
if (iova == DMA_MAPPING_ERROR && is_swiotlb_buffer(phys))
swiotlb_tbl_unmap_single(dev, phys, org_size, dir, attrs);
@@ -847,14 +850,9 @@ static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
{
phys_addr_t phys = page_to_phys(page) + offset;
bool coherent = dev_is_dma_coherent(dev);
- dma_addr_t dma_handle;

- dma_handle = __iommu_dma_map_swiotlb(dev, phys, size, dma_get_mask(dev),
+ return __iommu_dma_map_swiotlb(dev, phys, size, dma_get_mask(dev),
coherent, dir, attrs);
- if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
- dma_handle != DMA_MAPPING_ERROR)
- arch_sync_dma_for_device(phys, size, dir);
- return dma_handle;
}

static void iommu_dma_unmap_page(struct device *dev, dma_addr_t dma_handle,
@@ -997,12 +995,12 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
iommu_deferred_attach(dev, domain))
return 0;

- if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
- iommu_dma_sync_sg_for_device(dev, sg, nents, dir);
-
if (dev_is_untrusted(dev))
return iommu_dma_map_sg_swiotlb(dev, sg, nents, dir, attrs);

+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ iommu_dma_sync_sg_for_device(dev, sg, nents, dir);
+
/*
* Work out how much IOVA space we need, and align the segments to
* IOVA granules for the IOMMU driver to handle. With some clever
--
2.32.0.93.g670b81a890-goog

2021-07-09 03:37:25

by David Stevens

Subject: [PATCH v2 3/4] dma-iommu: pass SKIP_CPU_SYNC to swiotlb unmap

From: David Stevens <[email protected]>

If SKIP_CPU_SYNC isn't already set, then iommu_dma_unmap_(page|sg) has
already called iommu_dma_sync_(single|sg)_for_cpu, so there is no need
to copy from the bounce buffer again.
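
For reference, a sketch of the unmap call flow whose duplicate copy this
avoids, assuming the caller did not pass DMA_ATTR_SKIP_CPU_SYNC:

	/*
	 * iommu_dma_unmap_page()
	 *   iommu_dma_sync_single_for_cpu()  - copies bounce buffer to the
	 *                                      original buffer
	 *   __iommu_dma_unmap_swiotlb()
	 *     __iommu_dma_unmap()
	 *     swiotlb_tbl_unmap_single()     - copied a second time before
	 *                                      this patch
	 *
	 * OR-ing DMA_ATTR_SKIP_CPU_SYNC into the swiotlb_tbl_unmap_single()
	 * call makes the second copy a no-op without touching the attrs the
	 * caller passed in.
	 */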

Signed-off-by: David Stevens <[email protected]>
---
drivers/iommu/dma-iommu.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index e79e274d2dc5..0a9a9a343e64 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -505,7 +505,8 @@ static void __iommu_dma_unmap_swiotlb(struct device *dev, dma_addr_t dma_addr,
__iommu_dma_unmap(dev, dma_addr, size);

if (unlikely(is_swiotlb_buffer(phys)))
- swiotlb_tbl_unmap_single(dev, phys, size, dir, attrs);
+ swiotlb_tbl_unmap_single(dev, phys, size, dir,
+ attrs | DMA_ATTR_SKIP_CPU_SYNC);
}

static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
--
2.32.0.93.g670b81a890-goog

2021-07-09 03:37:57

by David Stevens

Subject: [PATCH v2 4/4] dma-iommu: Check CONFIG_SWIOTLB more broadly

From: David Stevens <[email protected]>

Introduce a new dev_use_swiotlb function to guard swiotlb code, instead
of overloading dev_is_untrusted. This allows CONFIG_SWIOTLB to be
checked more broadly, so the swiotlb-related code can be removed more
aggressively.
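
Since IS_ENABLED(CONFIG_SWIOTLB) is a compile-time constant,
dev_use_swiotlb() constant-folds to false on !CONFIG_SWIOTLB builds, so
the compiler can drop the guarded branches and the swiotlb-only helpers
they reference. A sketch (the caller shown is hypothetical):

	static bool dev_use_swiotlb(struct device *dev)
	{
		return IS_ENABLED(CONFIG_SWIOTLB) && dev_is_untrusted(dev);
	}

	/* Hypothetical caller: dead code when CONFIG_SWIOTLB is not set. */
	if (dev_use_swiotlb(dev))
		return iommu_dma_map_sg_swiotlb(dev, sg, nents, dir, attrs);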

Signed-off-by: David Stevens <[email protected]>
---
drivers/iommu/dma-iommu.c | 24 ++++++++++++++----------
1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 0a9a9a343e64..d8a0764c69aa 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -315,6 +315,11 @@ static bool dev_is_untrusted(struct device *dev)
return dev_is_pci(dev) && to_pci_dev(dev)->untrusted;
}

+static bool dev_use_swiotlb(struct device *dev)
+{
+ return IS_ENABLED(CONFIG_SWIOTLB) && dev_is_untrusted(dev);
+}
+
/**
* iommu_dma_init_domain - Initialise a DMA mapping domain
* @domain: IOMMU domain previously prepared by iommu_get_dma_cookie()
@@ -552,8 +557,7 @@ static dma_addr_t __iommu_dma_map_swiotlb(struct device *dev, phys_addr_t phys,
* If both the physical buffer start address and size are
* page aligned, we don't need to use a bounce page.
*/
- if (IS_ENABLED(CONFIG_SWIOTLB) && dev_is_untrusted(dev) &&
- iova_offset(iovad, phys | org_size)) {
+ if (dev_use_swiotlb(dev) && iova_offset(iovad, phys | org_size)) {
aligned_size = iova_align(iovad, org_size);
phys = swiotlb_tbl_map_single(dev, phys, org_size,
aligned_size, dir, attrs);
@@ -778,7 +782,7 @@ static void iommu_dma_sync_single_for_cpu(struct device *dev,
{
phys_addr_t phys;

- if (dev_is_dma_coherent(dev) && !dev_is_untrusted(dev))
+ if (dev_is_dma_coherent(dev) && !dev_use_swiotlb(dev))
return;

phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
@@ -794,7 +798,7 @@ static void iommu_dma_sync_single_for_device(struct device *dev,
{
phys_addr_t phys;

- if (dev_is_dma_coherent(dev) && !dev_is_untrusted(dev))
+ if (dev_is_dma_coherent(dev) && !dev_use_swiotlb(dev))
return;

phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
@@ -812,10 +816,10 @@ static void iommu_dma_sync_sg_for_cpu(struct device *dev,
struct scatterlist *sg;
int i;

- if (dev_is_dma_coherent(dev) && !dev_is_untrusted(dev))
+ if (dev_is_dma_coherent(dev) && !dev_use_swiotlb(dev))
return;

- if (dev_is_untrusted(dev))
+ if (dev_use_swiotlb(dev))
for_each_sg(sgl, sg, nelems, i)
iommu_dma_sync_single_for_cpu(dev, sg_dma_address(sg),
sg->length, dir);
@@ -832,10 +836,10 @@ static void iommu_dma_sync_sg_for_device(struct device *dev,
struct scatterlist *sg;
int i;

- if (dev_is_dma_coherent(dev) && !dev_is_untrusted(dev))
+ if (dev_is_dma_coherent(dev) && !dev_use_swiotlb(dev))
return;

- if (dev_is_untrusted(dev))
+ if (dev_use_swiotlb(dev))
for_each_sg(sgl, sg, nelems, i)
iommu_dma_sync_single_for_device(dev,
sg_dma_address(sg),
@@ -996,7 +1000,7 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
iommu_deferred_attach(dev, domain))
return 0;

- if (dev_is_untrusted(dev))
+ if (dev_use_swiotlb(dev))
return iommu_dma_map_sg_swiotlb(dev, sg, nents, dir, attrs);

if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
@@ -1071,7 +1075,7 @@ static void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
iommu_dma_sync_sg_for_cpu(dev, sg, nents, dir);

- if (dev_is_untrusted(dev)) {
+ if (dev_use_swiotlb(dev)) {
iommu_dma_unmap_sg_swiotlb(dev, sg, nents, dir, attrs);
return;
}
--
2.32.0.93.g670b81a890-goog

2021-08-02 13:32:06

by Will Deacon

Subject: Re: [PATCH v2 1/4] dma-iommu: fix sync_sg with swiotlb

On Fri, Jul 09, 2021 at 12:34:59PM +0900, David Stevens wrote:
> From: David Stevens <[email protected]>
>
> The is_swiotlb_buffer function takes the physical address of the swiotlb
> buffer, not the physical address of the original buffer. The sglist
> contains the physical addresses of the original buffer, so for the
> sync_sg functions to work properly when a bounce buffer might have been
> used, we need to use iommu_iova_to_phys to look up the physical address.
> This is what sync_single does, so call that function on each sglist
> segment.
>
> The previous code mostly worked because swiotlb does the transfer on map
> and unmap. However, any callers which use DMA_ATTR_SKIP_CPU_SYNC with
> sglists or which call sync_sg would not have had anything copied to the
> bounce buffer.
>
> Fixes: 82612d66d51d ("iommu: Allow the dma-iommu api to use bounce buffers")
> Signed-off-by: David Stevens <[email protected]>
> ---
> drivers/iommu/dma-iommu.c | 26 +++++++++++++-------------
> 1 file changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 7bcdd1205535..eac65302439e 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -811,14 +811,14 @@ static void iommu_dma_sync_sg_for_cpu(struct device *dev,
> if (dev_is_dma_coherent(dev) && !dev_is_untrusted(dev))
> return;
>
> - for_each_sg(sgl, sg, nelems, i) {
> - if (!dev_is_dma_coherent(dev))
> - arch_sync_dma_for_cpu(sg_phys(sg), sg->length, dir);
> -
> - if (is_swiotlb_buffer(sg_phys(sg)))
> + if (dev_is_untrusted(dev))
> + for_each_sg(sgl, sg, nelems, i)
> + iommu_dma_sync_single_for_cpu(dev, sg_dma_address(sg),
> + sg->length, dir);
> + else
> + for_each_sg(sgl, sg, nelems, i)
> swiotlb_sync_single_for_cpu(dev, sg_phys(sg),
> sg->length, dir);

Doesn't this skip arch_sync_dma_for_cpu() for non-coherent trusted devices?

Why not skip the extra dev_is_untrusted(dev) call here and just call
iommu_dma_sync_single_for_cpu() for each entry regardless?

Will

2021-08-02 13:43:02

by Will Deacon

Subject: Re: [PATCH v2 2/4] dma-iommu: fix arch_sync_dma for map with swiotlb

On Fri, Jul 09, 2021 at 12:35:00PM +0900, David Stevens wrote:
> From: David Stevens <[email protected]>
>
> When calling arch_sync_dma, we need to pass it the memory that's
> actually being used for dma. When using swiotlb bounce buffers, this is
> the bounce buffer. Move arch_sync_dma into the __iommu_dma_map_swiotlb
> helper, so it can use the bounce buffer address if necessary. This also
> means it is no longer necessary to call iommu_dma_sync_sg_for_device in
> iommu_dma_map_sg for untrusted devices.
>
> Fixes: 82612d66d51d ("iommu: Allow the dma-iommu api to use bounce buffers")
> Signed-off-by: David Stevens <[email protected]>
> ---
> drivers/iommu/dma-iommu.c | 16 +++++++---------
> 1 file changed, 7 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index eac65302439e..e79e274d2dc5 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -574,6 +574,9 @@ static dma_addr_t __iommu_dma_map_swiotlb(struct device *dev, phys_addr_t phys,
> memset(padding_start, 0, padding_size);
> }
>
> + if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
> + arch_sync_dma_for_device(phys, org_size, dir);

I think this relies on the swiotlb buffers residing in the linear mapping
(i.e. where phys_to_virt() is reliable), which doesn't look like a safe
assumption to me.

Will

2021-08-02 13:57:26

by Will Deacon

Subject: Re: [PATCH v2 2/4] dma-iommu: fix arch_sync_dma for map with swiotlb

On Mon, Aug 02, 2021 at 02:40:59PM +0100, Will Deacon wrote:
> On Fri, Jul 09, 2021 at 12:35:00PM +0900, David Stevens wrote:
> > From: David Stevens <[email protected]>
> >
> > When calling arch_sync_dma, we need to pass it the memory that's
> > actually being used for dma. When using swiotlb bounce buffers, this is
> > the bounce buffer. Move arch_sync_dma into the __iommu_dma_map_swiotlb
> > helper, so it can use the bounce buffer address if necessary. This also
> > means it is no longer necessary to call iommu_dma_sync_sg_for_device in
> > iommu_dma_map_sg for untrusted devices.
> >
> > Fixes: 82612d66d51d ("iommu: Allow the dma-iommu api to use bounce buffers")
> > Signed-off-by: David Stevens <[email protected]>
> > ---
> > drivers/iommu/dma-iommu.c | 16 +++++++---------
> > 1 file changed, 7 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> > index eac65302439e..e79e274d2dc5 100644
> > --- a/drivers/iommu/dma-iommu.c
> > +++ b/drivers/iommu/dma-iommu.c
> > @@ -574,6 +574,9 @@ static dma_addr_t __iommu_dma_map_swiotlb(struct device *dev, phys_addr_t phys,
> > memset(padding_start, 0, padding_size);
> > }
> >
> > + if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
> > + arch_sync_dma_for_device(phys, org_size, dir);
>
> I think this relies on the swiotlb buffers residing in the linear mapping
> (i.e. where phys_to_virt() is reliable), which doesn't look like a safe
> assumption to me.

No, sorry, ignore me here. I misread swiotlb_bounce(), so I think your
change is good.

As an aside, it strikes me that we'd probably be better off using
uncacheable bounce buffers for non-coherent devices so we could avoid all
this maintenance entirely.

Will

2021-08-02 14:14:02

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] dma-iommu: pass SKIP_CPU_SYNC to swiotlb unmap

On Fri, Jul 09, 2021 at 12:35:01PM +0900, David Stevens wrote:
> From: David Stevens <[email protected]>
>
> If SKIP_CPU_SYNC isn't already set, then iommu_dma_unmap_(page|sg) has
> already called iommu_dma_sync_(single|sg)_for_cpu, so there is no need
> to copy from the bounce buffer again.
>
> Signed-off-by: David Stevens <[email protected]>
> ---
> drivers/iommu/dma-iommu.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index e79e274d2dc5..0a9a9a343e64 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -505,7 +505,8 @@ static void __iommu_dma_unmap_swiotlb(struct device *dev, dma_addr_t dma_addr,
> __iommu_dma_unmap(dev, dma_addr, size);
>
> if (unlikely(is_swiotlb_buffer(phys)))
> - swiotlb_tbl_unmap_single(dev, phys, size, dir, attrs);
> + swiotlb_tbl_unmap_single(dev, phys, size, dir,
> + attrs | DMA_ATTR_SKIP_CPU_SYNC);
> }

I think it would be cleaner to drop DMA_ATTR_SKIP_CPU_SYNC in the callers
once they've called iommu_dma_sync_*_for_cpu().

Will

2021-08-05 05:37:49

by David Stevens

Subject: Re: [PATCH v2 3/4] dma-iommu: pass SKIP_CPU_SYNC to swiotlb unmap

On Mon, Aug 2, 2021 at 10:54 PM Will Deacon <[email protected]> wrote:
>
> On Fri, Jul 09, 2021 at 12:35:01PM +0900, David Stevens wrote:
> > From: David Stevens <[email protected]>
> >
> > If SKIP_CPU_SYNC isn't already set, then iommu_dma_unmap_(page|sg) has
> > already called iommu_dma_sync_(single|sg)_for_cpu, so there is no need
> > to copy from the bounce buffer again.
> >
> > Signed-off-by: David Stevens <[email protected]>
> > ---
> > drivers/iommu/dma-iommu.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> > index e79e274d2dc5..0a9a9a343e64 100644
> > --- a/drivers/iommu/dma-iommu.c
> > +++ b/drivers/iommu/dma-iommu.c
> > @@ -505,7 +505,8 @@ static void __iommu_dma_unmap_swiotlb(struct device *dev, dma_addr_t dma_addr,
> > __iommu_dma_unmap(dev, dma_addr, size);
> >
> > if (unlikely(is_swiotlb_buffer(phys)))
> > - swiotlb_tbl_unmap_single(dev, phys, size, dir, attrs);
> > + swiotlb_tbl_unmap_single(dev, phys, size, dir,
> > + attrs | DMA_ATTR_SKIP_CPU_SYNC);
> > }
>
> I think it would be cleaner to drop DMA_ATTR_SKIP_CPU_SYNC in the callers
> once they've called iommu_dma_sync_*_for_cpu().

Dropping that flag in iommu_dma_unmap_* would result in always copying
from the swiotlb here, which is the opposite direction of what this
patch is trying to do.

This change aims to address the case where DMA_ATTR_SKIP_CPU_SYNC
isn't passed to dma_unmap_*. In that case, there are calls to
swiotlb_sync_single_for_cpu from iommu_dma_sync_*_for_cpu, and calls
to swiotlb_tbl_unmap_single here. That means we copy from the swiotlb
twice. Adding the DMA_ATTR_SKIP_CPU_SYNC flag here skips the second
copy.

-David

> Will

2021-08-05 06:46:35

by David Stevens

Subject: Re: [PATCH v2 1/4] dma-iommu: fix sync_sg with swiotlb

On Mon, Aug 2, 2021 at 10:30 PM Will Deacon <[email protected]> wrote:
>
> On Fri, Jul 09, 2021 at 12:34:59PM +0900, David Stevens wrote:
> > From: David Stevens <[email protected]>
> >
> > The is_swiotlb_buffer function takes the physical address of the swiotlb
> > buffer, not the physical address of the original buffer. The sglist
> > contains the physical addresses of the original buffer, so for the
> > sync_sg functions to work properly when a bounce buffer might have been
> > used, we need to use iommu_iova_to_phys to look up the physical address.
> > This is what sync_single does, so call that function on each sglist
> > segment.
> >
> > The previous code mostly worked because swiotlb does the transfer on map
> > and unmap. However, any callers which use DMA_ATTR_SKIP_CPU_SYNC with
> > sglists or which call sync_sg would not have had anything copied to the
> > bounce buffer.
> >
> > Fixes: 82612d66d51d ("iommu: Allow the dma-iommu api to use bounce buffers")
> > Signed-off-by: David Stevens <[email protected]>
> > ---
> > drivers/iommu/dma-iommu.c | 26 +++++++++++++-------------
> > 1 file changed, 13 insertions(+), 13 deletions(-)
> >
> > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> > index 7bcdd1205535..eac65302439e 100644
> > --- a/drivers/iommu/dma-iommu.c
> > +++ b/drivers/iommu/dma-iommu.c
> > @@ -811,14 +811,14 @@ static void iommu_dma_sync_sg_for_cpu(struct device *dev,
> > if (dev_is_dma_coherent(dev) && !dev_is_untrusted(dev))
> > return;
> >
> > - for_each_sg(sgl, sg, nelems, i) {
> > - if (!dev_is_dma_coherent(dev))
> > - arch_sync_dma_for_cpu(sg_phys(sg), sg->length, dir);
> > -
> > - if (is_swiotlb_buffer(sg_phys(sg)))
> > + if (dev_is_untrusted(dev))
> > + for_each_sg(sgl, sg, nelems, i)
> > + iommu_dma_sync_single_for_cpu(dev, sg_dma_address(sg),
> > + sg->length, dir);
> > + else
> > + for_each_sg(sgl, sg, nelems, i)
> > swiotlb_sync_single_for_cpu(dev, sg_phys(sg),
> > sg->length, dir);
>
> Doesn't this skip arch_sync_dma_for_cpu() for non-coherent trusted devices?

Whoops, this was supposed to be a call to arch_sync_dma_for_cpu, not to
swiotlb_sync_single_for_cpu, similar to the sync_sg_for_device case.

> Why not skip the extra dev_is_untrusted(dev) call here and just call
> iommu_dma_sync_single_for_cpu() for each entry regardless?

iommu_dma_sync_single_for_cpu calls iommu_iova_to_phys to translate
the dma_addr_t to a phys_addr_t. Since the physical address is readily
available, I think it's better to avoid that extra work.

> Will

2021-08-06 17:50:41

by Will Deacon

Subject: Re: [PATCH v2 3/4] dma-iommu: pass SKIP_CPU_SYNC to swiotlb unmap

On Thu, Aug 05, 2021 at 02:26:10PM +0900, David Stevens wrote:
> On Mon, Aug 2, 2021 at 10:54 PM Will Deacon <[email protected]> wrote:
> >
> > On Fri, Jul 09, 2021 at 12:35:01PM +0900, David Stevens wrote:
> > > From: David Stevens <[email protected]>
> > >
> > > If SKIP_CPU_SYNC isn't already set, then iommu_dma_unmap_(page|sg) has
> > > already called iommu_dma_sync_(single|sg)_for_cpu, so there is no need
> > > to copy from the bounce buffer again.
> > >
> > > Signed-off-by: David Stevens <[email protected]>
> > > ---
> > > drivers/iommu/dma-iommu.c | 3 ++-
> > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> > > index e79e274d2dc5..0a9a9a343e64 100644
> > > --- a/drivers/iommu/dma-iommu.c
> > > +++ b/drivers/iommu/dma-iommu.c
> > > @@ -505,7 +505,8 @@ static void __iommu_dma_unmap_swiotlb(struct device *dev, dma_addr_t dma_addr,
> > > __iommu_dma_unmap(dev, dma_addr, size);
> > >
> > > if (unlikely(is_swiotlb_buffer(phys)))
> > > - swiotlb_tbl_unmap_single(dev, phys, size, dir, attrs);
> > > + swiotlb_tbl_unmap_single(dev, phys, size, dir,
> > > + attrs | DMA_ATTR_SKIP_CPU_SYNC);
> > > }
> >
> > I think it would be cleaner to drop DMA_ATTR_SKIP_CPU_SYNC in the callers
> > once they've called iommu_dma_sync_*_for_cpu().
>
> Dropping that flag in iommu_dma_unmap_* would result in always copying
> from the swiotlb here, which is the opposite direction of what this
> patch is trying to do.

Sorry, probably poor wording on my part. What I mean is, rather than add
DMA_ATTR_SKIP_CPU_SYNC here, how about having the callers include it
in attrs instead, since they're the ones doing the initial sync?

Will