2024-05-03 18:37:59

by T.J. Mercier

Subject: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct

As of commit 861370f49ce4 ("iommu/dma: force bouncing if the size is
not cacheline-aligned") sg_dma_mark_swiotlb is called when
dma_map_sgtable takes the IOMMU path and uses SWIOTLB for some portion
of a scatterlist. It is never set for the direct path, so drivers
cannot always rely on sg_dma_is_swiotlb to return correctly after
calling dma_map_sgtable. Fix this by calling sg_dma_mark_swiotlb in the
direct path like it is in the IOMMU path.

Signed-off-by: T.J. Mercier <[email protected]>
---
kernel/dma/direct.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 4d543b1e9d57..52f0dcb25ca2 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -12,7 +12,7 @@
#include <linux/pfn.h>
#include <linux/vmalloc.h>
#include <linux/set_memory.h>
-#include <linux/slab.h>
+#include <linux/swiotlb.h>
#include "direct.h"

/*
@@ -497,6 +497,8 @@ int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
goto out_unmap;
}
sg_dma_len(sg) = sg->length;
+ if (is_swiotlb_buffer(dev, dma_to_phys(dev, sg->dma_address)))
+ sg_dma_mark_swiotlb(sg);
}

return nents;
--
2.45.0.rc1.225.g2a3ae87e7f-goog



2024-05-04 08:53:38

by Petr Tesařík

Subject: Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct

On Fri, 3 May 2024 18:37:12 +0000
"T.J. Mercier" <[email protected]> wrote:

> As of commit 861370f49ce4 ("iommu/dma: force bouncing if the size is
> not cacheline-aligned") sg_dma_mark_swiotlb is called when
> dma_map_sgtable takes the IOMMU path and uses SWIOTLB for some portion
> of a scatterlist. It is never set for the direct path, so drivers
> cannot always rely on sg_dma_is_swiotlb to return correctly after
> calling dma_map_sgtable. Fix this by calling sg_dma_mark_swiotlb in the
> direct path like it is in the IOMMU path.
>
> Signed-off-by: T.J. Mercier <[email protected]>
> ---
> kernel/dma/direct.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index 4d543b1e9d57..52f0dcb25ca2 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -12,7 +12,7 @@
> #include <linux/pfn.h>
> #include <linux/vmalloc.h>
> #include <linux/set_memory.h>
> -#include <linux/slab.h>
> +#include <linux/swiotlb.h>
> #include "direct.h"
>
> /*
> @@ -497,6 +497,8 @@ int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
> goto out_unmap;
> }
> sg_dma_len(sg) = sg->length;
> + if (is_swiotlb_buffer(dev, dma_to_phys(dev, sg->dma_address)))
> + sg_dma_mark_swiotlb(sg);
> }
>
> return nents;

I'm not sure this does the right thing. IIUC when the scatterlist flags
include SG_DMA_SWIOTLB, iommu_dma_sync_sg_for_*() will call
iommu_dma_sync_single_for_*(), which in turn translates the DMA address
to a physical address using iommu_iova_to_phys(). It seems to me that
this function may not work correctly if there is no IOMMU, but it also
seems to me that the scatterlist may contain such non-IOMMU addresses.

I'm no expert, so correct DMA-to-physical translation might in fact be
somehow implicitly guaranteed. If that's the case, could you explain it
in the commit message, please?

Petr T

2024-05-06 05:30:16

by Christoph Hellwig

Subject: Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct

On Fri, May 03, 2024 at 06:37:12PM +0000, T.J. Mercier wrote:
> As of commit 861370f49ce4 ("iommu/dma: force bouncing if the size is
> not cacheline-aligned") sg_dma_mark_swiotlb is called when
> dma_map_sgtable takes the IOMMU path and uses SWIOTLB for some portion
> of a scatterlist. It is never set for the direct path, so drivers
> cannot always rely on sg_dma_is_swiotlb to return correctly after
> calling dma_map_sgtable. Fix this by calling sg_dma_mark_swiotlb in the
> direct path like it is in the IOMMU path.

I don't think this is the right thing to do. Despite its name,
sg_dma_mark_swiotlb really is dma-iommu specific and doesn't make sense
in the context of dma-direct. If anything, we need to find a better name
for the flag.


2024-05-06 16:02:48

by T.J. Mercier

Subject: Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct

On Sun, May 5, 2024 at 10:29 PM Christoph Hellwig <[email protected]> wrote:
>
> On Fri, May 03, 2024 at 06:37:12PM +0000, T.J. Mercier wrote:
> > As of commit 861370f49ce4 ("iommu/dma: force bouncing if the size is
> > not cacheline-aligned") sg_dma_mark_swiotlb is called when
> > dma_map_sgtable takes the IOMMU path and uses SWIOTLB for some portion
> > of a scatterlist. It is never set for the direct path, so drivers
> > cannot always rely on sg_dma_is_swiotlb to return correctly after
> > calling dma_map_sgtable. Fix this by calling sg_dma_mark_swiotlb in the
> > direct path like it is in the IOMMU path.
>
> I don't think this is the right thing to do. Despite its name,
> sg_dma_mark_swiotlb really is dma-iommu specific and doesn't make sense
> in the context of dma-direct. If anything, we need to find a better name
> for the flag.
>

Oh, that's disappointing. I'm looking for a way to quickly check if
any addresses point at a SWIOTLB buffer without doing a potentially
expensive call to iommu_iova_to_phys. Since it's meant to be dma-iommu
only, I guess I could use sg_dma_is_swiotlb if iommu_get_domain_for_dev
returns a domain, and is_swiotlb_buffer otherwise for dma-direct, but
it'd be nice to have just one way to check, which is what the
SG_DMA_SWIOTLB flag looked like it could be used for.
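
Roughly, the fallback I'm picturing is something like the sketch below
(helper name hypothetical, completely untested):

#include <linux/dma-direct.h>
#include <linux/iommu.h>
#include <linux/scatterlist.h>
#include <linux/swiotlb.h>

/*
 * Hypothetical helper: after dma_map_sgtable(), report whether any
 * segment was bounced through SWIOTLB.  With an IOMMU domain, rely on
 * the SG_DMA_SWIOTLB flag set by dma-iommu; for dma-direct, translate
 * each DMA address back and check it against the SWIOTLB pool.
 */
static bool sgt_uses_swiotlb(struct device *dev, struct sg_table *sgt)
{
        struct scatterlist *sg;
        int i;

        if (iommu_get_domain_for_dev(dev)) {
                for_each_sgtable_dma_sg(sgt, sg, i)
                        if (sg_dma_is_swiotlb(sg))
                                return true;
                return false;
        }

        for_each_sgtable_dma_sg(sgt, sg, i)
                if (is_swiotlb_buffer(dev, dma_to_phys(dev, sg_dma_address(sg))))
                        return true;

        return false;
}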

2024-05-06 16:03:35

by Christoph Hellwig

Subject: Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct

On Mon, May 06, 2024 at 09:00:59AM -0700, T.J. Mercier wrote:
> Oh, that's disappointing. I'm looking for a way to quickly check if
> any addresses point at a SWIOTLB buffer without doing a potentially
> expensive call to iommu_iova_to_phys. Since it's meant to be dma-iommu
> only, I guess I could use sg_dma_is_swiotlb if iommu_get_domain_for_dev
> returns a domain, and is_swiotlb_buffer otherwise for dma-direct, but
> it'd be nice to have just one way to check, which is what the
> SG_DMA_SWIOTLB flag looked like it could be used for.

This sounds like you're trying to do that from a consumer of the
DMA API, which is simply wrong. What is the actual problem you are
trying to solve?

2024-05-06 16:11:00

by T.J. Mercier

Subject: Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct

On Mon, May 6, 2024 at 9:02 AM Christoph Hellwig <[email protected]> wrote:
>
> On Mon, May 06, 2024 at 09:00:59AM -0700, T.J. Mercier wrote:
> > Oh, that's disappointing. I'm looking for a way to quickly check if
> > any addresses point at a SWIOTLB buffer without doing a potentially
> > expensive call to iommu_iova_to_phys. Since it's meant to be dma-iommu
> > only, I guess I could use sg_dma_is_swiotlb if iommu_get_domain_for_dev
> > returns a domain, and is_swiotlb_buffer otherwise for dma-direct, but
> > it'd be nice to have just one way to check, which is what the
> > SG_DMA_SWIOTLB flag looked like it could be used for.
>
> This sounds like you're trying to do that from a consumer of the
> DMA API, which is simply wrong. What is the actual problem you are
> trying to solve?

I want to reject mapping a dma_buf for a device if any of the pages
that back the buffer require SWIOTLB. AFAICT there's no way to know
whether SWIOTLB is used until after calling dma_map_sg, so afterwards
I'm trying to check.
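
For context, the flow in the exporter's map_dma_buf callback would look
roughly like this (types and error handling simplified and hypothetical,
and it assumes sg_dma_is_swiotlb is valid on both paths, i.e. something
like $SUBJECT):

#include <linux/dma-buf.h>
#include <linux/dma-mapping.h>
#include <linux/err.h>
#include <linux/scatterlist.h>

struct heap_attachment {                        /* hypothetical */
        struct sg_table *table;
};

/* Sketch only: reject the mapping when any segment was bounced, since
 * the (uncached) buffer will never be synced. */
static struct sg_table *heap_map_dma_buf(struct dma_buf_attachment *attachment,
                                         enum dma_data_direction dir)
{
        struct heap_attachment *a = attachment->priv;
        struct sg_table *table = a->table;
        int ret;

        ret = dma_map_sgtable(attachment->dev, table, dir,
                              DMA_ATTR_SKIP_CPU_SYNC);
        if (ret)
                return ERR_PTR(ret);

        if (sg_dma_is_swiotlb(table->sgl)) {
                dma_unmap_sgtable(attachment->dev, table, dir,
                                  DMA_ATTR_SKIP_CPU_SYNC);
                return ERR_PTR(-EINVAL);
        }

        return table;
}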

2024-05-06 16:22:22

by Christoph Hellwig

Subject: Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct

On Mon, May 06, 2024 at 09:10:40AM -0700, T.J. Mercier wrote:
> I want to reject mapping a dma_buf for a device if any of the pages
> that back the buffer require SWIOTLB. AFAICT there's no way to know
> whether SWIOTLB is used until after calling dma_map_sg, so afterwards
> I'm trying to check.

You should not check, you simply must handle it by doing the proper
DMA API based ownership management.

2024-05-06 16:40:19

by T.J. Mercier

Subject: Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct

On Mon, May 6, 2024 at 9:19 AM Christoph Hellwig <[email protected]> wrote:
>
> On Mon, May 06, 2024 at 09:10:40AM -0700, T.J. Mercier wrote:
> > I want to reject mapping a dma_buf for a device if any of the pages
> > that back the buffer require SWIOTLB. AFAICT there's no way to know
> > whether SWIOTLB is used until after calling dma_map_sg, so afterwards
> > I'm trying to check.
>
> You should not check, you simply must handle it by doing the proper
> DMA API based ownership management.

That doesn't really work for uncached buffers. Since the SWIOTLB
bounces happen in the sync functions, and avoiding cache maintenance
operations (CMO) is the whole point of uncached buffers, it doesn't
make sense to try to map uncached buffers that would require SWIOTLB.
So unless we can get the DMA API to fail the map in this case (look
for DMA_ATTR_SKIP_CPU_SYNC + SWIOTLB?), I'm not sure how else this
should be done.

2024-05-07 05:43:27

by Christoph Hellwig

Subject: Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct

On Mon, May 06, 2024 at 09:39:53AM -0700, T.J. Mercier wrote:
> > You should not check, you simply must handle it by doing the proper
> > DMA API based ownership management.
>
> That doesn't really work for uncached buffers.

What uncached buffers?


2024-05-07 20:09:51

by T.J. Mercier

Subject: Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct

On Mon, May 6, 2024 at 10:43 PM Christoph Hellwig <[email protected]> wrote:
>
> On Mon, May 06, 2024 at 09:39:53AM -0700, T.J. Mercier wrote:
> > > You should not check, you simply must handle it by doing the proper
> > > DMA API based ownership management.
> >
> > That doesn't really work for uncached buffers.
>
> What uncached buffers?

For example these ones:
https://android.googlesource.com/kernel/common/+/refs/heads/android-mainline/drivers/dma-buf/heaps/system_heap.c#141

Vendors have their own drivers that also export uncached buffers in a
similar way.

2024-05-08 11:34:09

by Christoph Hellwig

Subject: Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct

On Tue, May 07, 2024 at 01:07:25PM -0700, T.J. Mercier wrote:
> On Mon, May 6, 2024 at 10:43 PM Christoph Hellwig <[email protected]> wrote:
> >
> > On Mon, May 06, 2024 at 09:39:53AM -0700, T.J. Mercier wrote:
> > > > You should not check, you simply must handle it by doing the proper
> > > > DMA API based ownership management.
> > >
> > > That doesn't really work for uncached buffers.
> >
> > What uncached buffers?
>
> For example these ones:
> https://android.googlesource.com/kernel/common/+/refs/heads/android-mainline/drivers/dma-buf/heaps/system_heap.c#141

Whatever that code is doing is probably not upstream because it's too
broken to live.


2024-05-08 17:19:50

by Catalin Marinas

Subject: Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct

On Tue, May 07, 2024 at 01:07:25PM -0700, T.J. Mercier wrote:
> On Mon, May 6, 2024 at 10:43 PM Christoph Hellwig <[email protected]> wrote:
> > On Mon, May 06, 2024 at 09:39:53AM -0700, T.J. Mercier wrote:
> > > > You should not check, you simply must handle it by doing the proper
> > > > DMA API based ownership management.
> > >
> > > That doesn't really work for uncached buffers.
> >
> > What uncached buffers?
>
> For example these ones:
> https://android.googlesource.com/kernel/common/+/refs/heads/android-mainline/drivers/dma-buf/heaps/system_heap.c#141
>
> Vendors have their own drivers that also export uncached buffers in a
> similar way.

IIUC, you have an uncached dma buffer and you want to pass those pages
to dma_map_sgtable() with DMA_ATTR_SKIP_CPU_SYNC since the buffer has
already been made coherent via other means (the CPU mapping is
uncached). The small kmalloc() bouncing gets in the way as it copies the
data to a cached buffer but it also skips the cache maintenance because
of DMA_ATTR_SKIP_CPU_SYNC. I assume Android carries these patches:

https://lore.kernel.org/r/[email protected]/

Arguably this is abusing the DMA API since DMA_ATTR_SKIP_CPU_SYNC should
be used for subsequent mappings of buffers already mapped to device by a
prior dma_map_*() operation. In the above patchset, the buffer is
vmap'ed by the dma-buf heap code and DMA_ATTR_SKIP_CPU_SYNC is used on
the first dma_map_*().

Ignoring the above hacks, I think we can get in a similar situation even
with more complex uses of the DMA API. Let's say some buffers are
initially mapped to device with dma_map_page(), some of them being
bounced but cache maintenance completed. A subsequent dma_map_*()
on those pages may force a bouncing again but DMA_ATTR_SKIP_CPU_SYNC
will avoid the cache maintenance. You are not actually sharing the
original buffers but create separate (bounced) copies no longer coherent
with the device.

I think in general buffer sharing with multiple dma_map_*() calls on the
same buffer and DMA_ATTR_SKIP_CPU_SYNC is incompatible with bouncing,
irrespective of the kmalloc() minalign series. If you do this for a
32-bit device and one of the pages is outside the ZONE_DMA32 range,
you'd get a similar behaviour.

From the kmalloc() minimum alignment perspective, it makes sense to skip
the bouncing if DMA_ATTR_SKIP_CPU_SYNC is passed. We also skip the
bouncing if the direction is DMA_TO_DEVICE or the device is fully
coherent.

A completely untested patch below. It doesn't solve other problems with
bouncing you may have with your out of tree patches and, as Christoph
said, checking in your driver whether the DMA address is a swiotlb
buffer is completely wrong.

---------8<------------------------
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index e4cb26f6a943..c7ff464a5f81 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -602,15 +602,16 @@ static bool dev_is_untrusted(struct device *dev)
}

static bool dev_use_swiotlb(struct device *dev, size_t size,
- enum dma_data_direction dir)
+ enum dma_data_direction dir, unsigned long attrs)
{
return IS_ENABLED(CONFIG_SWIOTLB) &&
(dev_is_untrusted(dev) ||
- dma_kmalloc_needs_bounce(dev, size, dir));
+ dma_kmalloc_needs_bounce(dev, size, dir, attrs));
}

static bool dev_use_sg_swiotlb(struct device *dev, struct scatterlist *sg,
- int nents, enum dma_data_direction dir)
+ int nents, enum dma_data_direction dir,
+ unsigned long attrs)
{
struct scatterlist *s;
int i;
@@ -626,7 +627,7 @@ static bool dev_use_sg_swiotlb(struct device *dev, struct scatterlist *sg,
* direction, check the individual lengths in the sg list. If any
* element is deemed unsafe, use the swiotlb for bouncing.
*/
- if (!dma_kmalloc_safe(dev, dir)) {
+ if (!dma_kmalloc_safe(dev, dir, attrs)) {
for_each_sg(sg, s, nents, i)
if (!dma_kmalloc_size_aligned(s->length))
return true;
@@ -1076,7 +1077,7 @@ static void iommu_dma_sync_single_for_cpu(struct device *dev,
{
phys_addr_t phys;

- if (dev_is_dma_coherent(dev) && !dev_use_swiotlb(dev, size, dir))
+ if (dev_is_dma_coherent(dev) && !dev_use_swiotlb(dev, size, dir, 0))
return;

phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
@@ -1092,7 +1093,7 @@ static void iommu_dma_sync_single_for_device(struct device *dev,
{
phys_addr_t phys;

- if (dev_is_dma_coherent(dev) && !dev_use_swiotlb(dev, size, dir))
+ if (dev_is_dma_coherent(dev) && !dev_use_swiotlb(dev, size, dir, 0))
return;

phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
@@ -1152,7 +1153,7 @@ static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
* If both the physical buffer start address and size are
* page aligned, we don't need to use a bounce page.
*/
- if (dev_use_swiotlb(dev, size, dir) &&
+ if (dev_use_swiotlb(dev, size, dir, attrs) &&
iova_offset(iovad, phys | size)) {
void *padding_start;
size_t padding_size, aligned_size;
@@ -1369,7 +1370,7 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
goto out;
}

- if (dev_use_sg_swiotlb(dev, sg, nents, dir))
+ if (dev_use_sg_swiotlb(dev, sg, nents, dir, attrs))
return iommu_dma_map_sg_swiotlb(dev, sg, nents, dir, attrs);

if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index 4abc60f04209..857a460e40f0 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -277,7 +277,8 @@ static inline bool dev_is_dma_coherent(struct device *dev)
* Check whether potential kmalloc() buffers are safe for non-coherent DMA.
*/
static inline bool dma_kmalloc_safe(struct device *dev,
- enum dma_data_direction dir)
+ enum dma_data_direction dir,
+ unsigned long attrs)
{
/*
* If DMA bouncing of kmalloc() buffers is disabled, the kmalloc()
@@ -288,10 +289,12 @@ static inline bool dma_kmalloc_safe(struct device *dev,

/*
* kmalloc() buffers are DMA-safe irrespective of size if the device
- * is coherent or the direction is DMA_TO_DEVICE (non-desctructive
- * cache maintenance and benign cache line evictions).
+ * is coherent, the direction is DMA_TO_DEVICE (non-desctructive
+ * cache maintenance and benign cache line evictions) or the
+ * attributes require no cache maintenance.
*/
- if (dev_is_dma_coherent(dev) || dir == DMA_TO_DEVICE)
+ if (dev_is_dma_coherent(dev) || dir == DMA_TO_DEVICE ||
+ attrs & DMA_ATTR_SKIP_CPU_SYNC)
return true;

return false;
@@ -328,9 +331,11 @@ static inline bool dma_kmalloc_size_aligned(size_t size)
* in the kernel.
*/
static inline bool dma_kmalloc_needs_bounce(struct device *dev, size_t size,
- enum dma_data_direction dir)
+ enum dma_data_direction dir,
+ unsigned long attrs)
{
- return !dma_kmalloc_safe(dev, dir) && !dma_kmalloc_size_aligned(size);
+ return !dma_kmalloc_safe(dev, dir, attrs) &&
+ !dma_kmalloc_size_aligned(size);
}

void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
diff --git a/kernel/dma/direct.h b/kernel/dma/direct.h
index 18d346118fe8..c2d31a67719e 100644
--- a/kernel/dma/direct.h
+++ b/kernel/dma/direct.h
@@ -96,7 +96,7 @@ static inline dma_addr_t dma_direct_map_page(struct device *dev,
}

if (unlikely(!dma_capable(dev, dma_addr, size, true)) ||
- dma_kmalloc_needs_bounce(dev, size, dir)) {
+ dma_kmalloc_needs_bounce(dev, size, dir, attrs)) {
if (is_pci_p2pdma_page(page))
return DMA_MAPPING_ERROR;
if (is_swiotlb_active(dev))

--
Catalin

2024-05-08 20:15:06

by T.J. Mercier

Subject: Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct

On Wed, May 8, 2024 at 10:19 AM Catalin Marinas <[email protected]> wrote:
>
> On Tue, May 07, 2024 at 01:07:25PM -0700, T.J. Mercier wrote:
> > On Mon, May 6, 2024 at 10:43 PM Christoph Hellwig <[email protected]> wrote:
> > > On Mon, May 06, 2024 at 09:39:53AM -0700, T.J. Mercier wrote:
> > > > > You should not check, you simply must handle it by doing the proper
> > > > > DMA API based ownership management.
> > > >
> > > > That doesn't really work for uncached buffers.
> > >
> > > What uncached buffers?
> >
> > For example these ones:
> > https://android.googlesource.com/kernel/common/+/refs/heads/android-mainline/drivers/dma-buf/heaps/system_heap.c#141
> >
> > Vendors have their own drivers that also export uncached buffers in a
> > similar way.
>
> IIUC, you have an uncached dma buffer and you want to pass those pages
> to dma_map_sgtable() with DMA_ATTR_SKIP_CPU_SYNC since the buffer has
> already been made coherent via other means (the CPU mapping is
> uncached). The small kmalloc() bouncing gets in the way as it copies the
> data to a cached buffer but it also skips the cache maintenance because
> of DMA_ATTR_SKIP_CPU_SYNC.

Yes, that's a problem at the time of the mapping. It's also a problem
for dma_buf_{begin,end}_cpu_access calls because implementing an
uncached buffer means we want to skip dma_sync_sgtable_for_*, but that
also skips the swiotlb copy. The goal is to only let the mapping
succeed if the cache maintenance can always be skipped while also
ensuring all users have a correct view of the memory, and that doesn't
seem possible where swiotlb is involved.
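
For reference, the uncached variant's begin_cpu_access boils down to
skipping the sync entirely, roughly like this (paraphrasing the Android
code linked above from memory, not verbatim):

static int system_heap_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
                                                enum dma_data_direction dir)
{
        struct system_heap_buffer *buffer = dmabuf->priv;
        struct dma_heap_attachment *a;

        mutex_lock(&buffer->lock);

        if (buffer->vmap_cnt)
                invalidate_kernel_vmap_range(buffer->vaddr, buffer->len);

        /* Uncached buffers skip the CMO entirely.  If a segment was
         * bounced, this also skips the swiotlb copy-back, so the CPU
         * never sees what the device wrote. */
        if (!buffer->uncached) {
                list_for_each_entry(a, &buffer->attachments, list) {
                        if (!a->mapped)
                                continue;
                        dma_sync_sgtable_for_cpu(a->dev, a->table, dir);
                }
        }

        mutex_unlock(&buffer->lock);

        return 0;
}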

> I assume Android carries these patches:

> https://lore.kernel.org/r/[email protected]/

Correct.

> Arguably this is abusing the DMA API since DMA_ATTR_SKIP_CPU_SYNC should
> be used for subsequent mappings of buffers already mapped to device by a
> prior dma_map_*() operation. In the above patchset, the buffer is
> vmap'ed by the dma-buf heap code and DMA_ATTR_SKIP_CPU_SYNC is used on
> the first dma_map_*().
>
> Ignoring the above hacks, I think we can get in a similar situation even
> with more complex uses of the DMA API. Let's say some buffers are
> initially mapped to device with dma_map_page(), some of them being
> bounced but cache maintenance completed. A subsequent dma_map_*()
> on those pages may force a bouncing again but DMA_ATTR_SKIP_CPU_SYNC
> will avoid the cache maintenance. You are not actually sharing the
> original buffers but create separate (bounced) copies no longer coherent
> with the device.

Right, no good. The inverse of the dma_buf_begin_cpu_access problem.

> I think in general buffer sharing with multiple dma_map_*() calls on the
> same buffer and DMA_ATTR_SKIP_CPU_SYNC is incompatible with bouncing,
> irrespective of the kmalloc() minalign series. If you do this for a
> 32-bit device and one of the pages is outside the ZONE_DMA32 range,
> you'd get a similar behaviour.
>
> From the kmalloc() minimum alignment perspective, it makes sense to skip
> the bouncing if DMA_ATTR_SKIP_CPU_SYNC is passed. We also skip the
> bouncing if the direction is DMA_TO_DEVICE or the device is fully
> coherent.
>
> A completely untested patch below. It doesn't solve other problems with
> bouncing you may have with your out of tree patches and, as Christoph
> said, checking in your driver whether the DMA address is a swiotlb
> buffer is completely wrong.

This is where I must be missing something. Is the main opposition that
the *driver* is checking for swiotlb use (instead of inside the DMA
API)? Because it sounds like we agree it's a bad idea to attempt
bouncing + DMA_ATTR_SKIP_CPU_SYNC.

This code looks like it almost gets there, but it'd still reach
swiotlb_map (instead of DMA_MAPPING_ERROR) with DMA_ATTR_SKIP_CPU_SYNC
set for force_bounce or if the dma_capable check fails.

> ---------8<------------------------
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index e4cb26f6a943..c7ff464a5f81 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -602,15 +602,16 @@ static bool dev_is_untrusted(struct device *dev)
> }
>
> static bool dev_use_swiotlb(struct device *dev, size_t size,
> - enum dma_data_direction dir)
> + enum dma_data_direction dir, unsigned long attrs)
> {
> return IS_ENABLED(CONFIG_SWIOTLB) &&
> (dev_is_untrusted(dev) ||
> - dma_kmalloc_needs_bounce(dev, size, dir));
> + dma_kmalloc_needs_bounce(dev, size, dir, attrs));
> }
>
> static bool dev_use_sg_swiotlb(struct device *dev, struct scatterlist *sg,
> - int nents, enum dma_data_direction dir)
> + int nents, enum dma_data_direction dir,
> + unsigned long attrs)
> {
> struct scatterlist *s;
> int i;
> @@ -626,7 +627,7 @@ static bool dev_use_sg_swiotlb(struct device *dev, struct scatterlist *sg,
> * direction, check the individual lengths in the sg list. If any
> * element is deemed unsafe, use the swiotlb for bouncing.
> */
> - if (!dma_kmalloc_safe(dev, dir)) {
> + if (!dma_kmalloc_safe(dev, dir, attrs)) {
> for_each_sg(sg, s, nents, i)
> if (!dma_kmalloc_size_aligned(s->length))
> return true;
> @@ -1076,7 +1077,7 @@ static void iommu_dma_sync_single_for_cpu(struct device *dev,
> {
> phys_addr_t phys;
>
> - if (dev_is_dma_coherent(dev) && !dev_use_swiotlb(dev, size, dir))
> + if (dev_is_dma_coherent(dev) && !dev_use_swiotlb(dev, size, dir, 0))
> return;
>
> phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
> @@ -1092,7 +1093,7 @@ static void iommu_dma_sync_single_for_device(struct device *dev,
> {
> phys_addr_t phys;
>
> - if (dev_is_dma_coherent(dev) && !dev_use_swiotlb(dev, size, dir))
> + if (dev_is_dma_coherent(dev) && !dev_use_swiotlb(dev, size, dir, 0))
> return;
>
> phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
> @@ -1152,7 +1153,7 @@ static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
> * If both the physical buffer start address and size are
> * page aligned, we don't need to use a bounce page.
> */
> - if (dev_use_swiotlb(dev, size, dir) &&
> + if (dev_use_swiotlb(dev, size, dir, attrs) &&
> iova_offset(iovad, phys | size)) {
> void *padding_start;
> size_t padding_size, aligned_size;
> @@ -1369,7 +1370,7 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
> goto out;
> }
>
> - if (dev_use_sg_swiotlb(dev, sg, nents, dir))
> + if (dev_use_sg_swiotlb(dev, sg, nents, dir, attrs))
> return iommu_dma_map_sg_swiotlb(dev, sg, nents, dir, attrs);
>
> if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
> diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
> index 4abc60f04209..857a460e40f0 100644
> --- a/include/linux/dma-map-ops.h
> +++ b/include/linux/dma-map-ops.h
> @@ -277,7 +277,8 @@ static inline bool dev_is_dma_coherent(struct device *dev)
> * Check whether potential kmalloc() buffers are safe for non-coherent DMA.
> */
> static inline bool dma_kmalloc_safe(struct device *dev,
> - enum dma_data_direction dir)
> + enum dma_data_direction dir,
> + unsigned long attrs)
> {
> /*
> * If DMA bouncing of kmalloc() buffers is disabled, the kmalloc()
> @@ -288,10 +289,12 @@ static inline bool dma_kmalloc_safe(struct device *dev,
>
> /*
> * kmalloc() buffers are DMA-safe irrespective of size if the device
> - * is coherent or the direction is DMA_TO_DEVICE (non-desctructive
> - * cache maintenance and benign cache line evictions).
> + * is coherent, the direction is DMA_TO_DEVICE (non-desctructive
> + * cache maintenance and benign cache line evictions) or the
> + * attributes require no cache maintenance.
> */
> - if (dev_is_dma_coherent(dev) || dir == DMA_TO_DEVICE)
> + if (dev_is_dma_coherent(dev) || dir == DMA_TO_DEVICE ||
> + attrs & DMA_ATTR_SKIP_CPU_SYNC)
> return true;
>
> return false;
> @@ -328,9 +331,11 @@ static inline bool dma_kmalloc_size_aligned(size_t size)
> * in the kernel.
> */
> static inline bool dma_kmalloc_needs_bounce(struct device *dev, size_t size,
> - enum dma_data_direction dir)
> + enum dma_data_direction dir,
> + unsigned long attrs)
> {
> - return !dma_kmalloc_safe(dev, dir) && !dma_kmalloc_size_aligned(size);
> + return !dma_kmalloc_safe(dev, dir, attrs) &&
> + !dma_kmalloc_size_aligned(size);
> }
>
> void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
> diff --git a/kernel/dma/direct.h b/kernel/dma/direct.h
> index 18d346118fe8..c2d31a67719e 100644
> --- a/kernel/dma/direct.h
> +++ b/kernel/dma/direct.h
> @@ -96,7 +96,7 @@ static inline dma_addr_t dma_direct_map_page(struct device *dev,
> }
>
> if (unlikely(!dma_capable(dev, dma_addr, size, true)) ||
> - dma_kmalloc_needs_bounce(dev, size, dir)) {
> + dma_kmalloc_needs_bounce(dev, size, dir, attrs)) {
> if (is_pci_p2pdma_page(page))
> return DMA_MAPPING_ERROR;
> if (is_swiotlb_active(dev))
>
> --
> Catalin

2024-05-09 07:49:52

by Catalin Marinas

Subject: Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct

On Wed, May 08, 2024 at 01:14:41PM -0700, T.J. Mercier wrote:
> On Wed, May 8, 2024 at 10:19 AM Catalin Marinas <[email protected]> wrote:
> > On Tue, May 07, 2024 at 01:07:25PM -0700, T.J. Mercier wrote:
> > > On Mon, May 6, 2024 at 10:43 PM Christoph Hellwig <[email protected]> wrote:
> > > > On Mon, May 06, 2024 at 09:39:53AM -0700, T.J. Mercier wrote:
> > > > > > You should not check, you simply must handle it by doing the proper
> > > > > > DMA API based ownership management.
> > > > >
> > > > > That doesn't really work for uncached buffers.
> > > >
> > > > What uncached buffers?
> > >
> > > For example these ones:
> > > https://android.googlesource.com/kernel/common/+/refs/heads/android-mainline/drivers/dma-buf/heaps/system_heap.c#141
> > >
> > > Vendors have their own drivers that also export uncached buffers in a
> > > similar way.
[...]
> > I think in general buffer sharing with multiple dma_map_*() calls on the
> > same buffer and DMA_ATTR_SKIP_CPU_SYNC is incompatible with bouncing,
> > irrespective of the kmalloc() minalign series. If you do this for a
> > 32-bit device and one of the pages is outside the ZONE_DMA32 range,
> > you'd get a similar behaviour.
> >
> > From the kmalloc() minimum alignment perspective, it makes sense to skip
> > the bouncing if DMA_ATTR_SKIP_CPU_SYNC is passed. We also skip the
> > bouncing if the direction is DMA_TO_DEVICE or the device is fully
> > coherent.
> >
> > A completely untested patch below. It doesn't solve other problems with
> > bouncing you may have with your out of tree patches and, as Christoph
> > said, checking in your driver whether the DMA address is a swiotlb
> > buffer is completely wrong.
>
> This is where I must be missing something. Is the main opposition that
> the *driver* is checking for swiotlb use (instead of inside the DMA
> API)?

I see the swiotlb use as some internal detail of the DMA API
implementation that should not leak outside this framework.

> Because it sounds like we agree it's a bad idea to attempt
> bouncing + DMA_ATTR_SKIP_CPU_SYNC.

It's not necessarily the DMA_ATTR_SKIP_CPU_SYNC but rather the usage
model of sharing a buffer between multiple devices. The DMA API is
mostly tailored around CPU <-> single device ownership and the bouncing
works fine. When sharing the same buffer with multiple devices, calling
dma_map_*() on a buffer can potentially create multiple copies of the
original CPU buffer. It may be fine _if_ the devices don't communicate
between themselves using such buffer, otherwise the model is broken
(sync or no sync). The additional issue with DMA_ATTR_SKIP_CPU_SYNC is
when you use it on subsequent dma_map_*() calls assuming that the sync
was already done on the first dma_map_*() call, but with bouncing it's
another DMA location (ignoring the Android-specific patches).

I think we should prevent bouncing if DMA_ATTR_SKIP_CPU_SYNC is passed.
However, this is not sufficient with a proper use of the DMA API since
the first dma_map_*() without this attribute can still do the bouncing.
IMHO what we need is a DMA_ATTR_NO_BOUNCE or DMA_ATTR_SHARED that will
be used on the first map and potentially on subsequent calls in
combination with DMA_ATTR_SKIP_CPU_SYNC (though we could use the latter
to imply "shared"). The downside is that mapping may fail if the
coherent mask is too narrow.

Anyway, the definitive answer should come from the DMA API maintainers.

> This code looks like it almost gets there, but it'd still reach
> swiotlb_map (instead of DMA_MAPPING_ERROR) with DMA_ATTR_SKIP_CPU_SYNC
> set for force_bounce or if the dma_capable check fails.

My quick patch was mainly to ensure the kmalloc() alignment patches do
not make the situation worse.

--
Catalin

2024-05-09 12:55:12

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct

On 08/05/2024 12:33 pm, Christoph Hellwig wrote:
> On Tue, May 07, 2024 at 01:07:25PM -0700, T.J. Mercier wrote:
>> On Mon, May 6, 2024 at 10:43 PM Christoph Hellwig <[email protected]> wrote:
>>>
>>> On Mon, May 06, 2024 at 09:39:53AM -0700, T.J. Mercier wrote:
>>>>> You should not check, you simply must handle it by doing the proper
>>>>> DMA API based ownership management.
>>>>
>>>> That doesn't really work for uncached buffers.
>>>
>>> What uncached buffers?
>>
>> For example these ones:
>> https://android.googlesource.com/kernel/common/+/refs/heads/android-mainline/drivers/dma-buf/heaps/system_heap.c#141
>
> Whatever that code is doing is probably not upstream because it's too
> broken to live.

Indeed, at a glance it appears to be trying to reinvent
dma_alloc_noncontiguous(). What's not immediately obvious is whether
it's particular about allocations being DMA-contiguous; if not then I
think it comes down to the same thing as vb2-dma-sg and the ideas we
were tossing around for that[1].

Thanks,
Robin.

[1]
https://lore.kernel.org/linux-media/[email protected]/

2024-05-09 13:07:12

by Christoph Hellwig

Subject: Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct

On Thu, May 09, 2024 at 08:49:40AM +0100, Catalin Marinas wrote:
> I see the swiotlb use as some internal detail of the DMA API
> implementation that should not leak outside this framework.

And that's what it is.

> I think we should prevent bouncing if DMA_ATTR_SKIP_CPU_SYNC is passed.
> However, this is not sufficient with a proper use of the DMA API since
> the first dma_map_*() without this attribute can still do the bouncing.
> IMHO what we need is a DMA_ATTR_NO_BOUNCE or DMA_ATTR_SHARED that will
> be used on the first map and potentially on subsequent calls in
> combination with DMA_ATTR_SKIP_CPU_SYNC (though we could use the latter
> to imply "shared"). The downside is that mapping may fail if the
> coherent mask is too narrow.

We have two big problems here that kinda interact:

1) DMA_ATTR_SKIP_CPU_SYNC is just a horrible API. It exposes an
implementation detail instead of dealing with use cases.
The original one IIRC was to deal with networking receive
buffers that are often only partially filled and the networking
folks wanted to avoid the overhead for doing the cache operations
for the rest. It kinda works for that but already gets iffy
when swiotlb is involved. The other abuses of the flag just
went downhill from there.

2) the model of dma mapping a single chunk of memory to multiple
devices is not really well accounted for in the DMA API.

So for two we need a memory allocator that can take the constraints
of multiple devices into account, and probably a way to fail a
dma-buf attach when the importer can't address the memory.
We also then need to come up with a memory ownership / cache
maintenance protocol that works for this use case.
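
A rough sketch of what failing the attach could look like (exporter
types hypothetical, and a real check would have to account for bus
address offsets rather than comparing raw physical addresses):

#include <linux/dma-buf.h>
#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

struct heap_buffer {                            /* hypothetical */
        struct sg_table sg_table;
};

/* Hypothetical exporter attach op: refuse importers whose DMA mask
 * cannot reach the pages already allocated for the buffer. */
static int heap_attach(struct dma_buf *dmabuf,
                       struct dma_buf_attachment *attachment)
{
        struct heap_buffer *buffer = dmabuf->priv;
        u64 mask = dma_get_mask(attachment->dev);
        struct scatterlist *sg;
        int i;

        for_each_sgtable_sg(&buffer->sg_table, sg, i) {
                phys_addr_t end = page_to_phys(sg_page(sg)) + sg->length - 1;

                if (end > mask)
                        return -ERANGE;
        }

        return 0;
}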

2024-05-09 13:37:29

by Robin Murphy

Subject: Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct

On 04/05/2024 9:53 am, Petr Tesařík wrote:
> On Fri, 3 May 2024 18:37:12 +0000
> "T.J. Mercier" <[email protected]> wrote:
>
>> As of commit 861370f49ce4 ("iommu/dma: force bouncing if the size is
>> not cacheline-aligned") sg_dma_mark_swiotlb is called when
>> dma_map_sgtable takes the IOMMU path and uses SWIOTLB for some portion
>> of a scatterlist. It is never set for the direct path, so drivers
>> cannot always rely on sg_dma_is_swiotlb to return correctly after
>> calling dma_map_sgtable. Fix this by calling sg_dma_mark_swiotlb in the
>> direct path like it is in the IOMMU path.
>>
>> Signed-off-by: T.J. Mercier <[email protected]>
>> ---
>> kernel/dma/direct.c | 4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
>> index 4d543b1e9d57..52f0dcb25ca2 100644
>> --- a/kernel/dma/direct.c
>> +++ b/kernel/dma/direct.c
>> @@ -12,7 +12,7 @@
>> #include <linux/pfn.h>
>> #include <linux/vmalloc.h>
>> #include <linux/set_memory.h>
>> -#include <linux/slab.h>
>> +#include <linux/swiotlb.h>
>> #include "direct.h"
>>
>> /*
>> @@ -497,6 +497,8 @@ int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
>> goto out_unmap;
>> }
>> sg_dma_len(sg) = sg->length;
>> + if (is_swiotlb_buffer(dev, dma_to_phys(dev, sg->dma_address)))
>> + sg_dma_mark_swiotlb(sg);
>> }
>>
>> return nents;
>
> I'm not sure this does the right thing. IIUC when the scatterlist flags
> include SG_DMA_SWIOTLB, iommu_dma_sync_sg_for_*() will call
> iommu_dma_sync_single_for_*(), which in turn translates the DMA address
> to a physical address using iommu_iova_to_phys(). It seems to me that
> this function may not work correctly if there is no IOMMU, but it also
> seems to me that the scatterlist may contain such non-IOMMU addresses.

In principle dma-direct *could* make use of the SG_DMA_SWIOTLB flag for
an ever-so-slightly cheaper check than is_swiotlb_buffer() in sync_sg
and unmap_sg, the same way as iommu-dma does. However the benefit would
be a lot less significant than for iommu-dma, where it's really about
the overhead of needing to perform iommu_iova_to_phys() translations for
every segment every time in order to *get* the right thing to check
is_swiotlb_buffer() on - that's what would be unreasonably prohibitive
otherwise.

Thanks,
Robin

2024-05-09 18:26:42

by T.J. Mercier

Subject: Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct

On Thu, May 9, 2024 at 5:55 AM Robin Murphy <[email protected]> wrote:
>
> On 08/05/2024 12:33 pm, Christoph Hellwig wrote:
> > On Tue, May 07, 2024 at 01:07:25PM -0700, T.J. Mercier wrote:
> >> On Mon, May 6, 2024 at 10:43 PM Christoph Hellwig <[email protected]> wrote:
> >>>
> >>> On Mon, May 06, 2024 at 09:39:53AM -0700, T.J. Mercier wrote:
> >>>>> You should not check, you simply must handle it by doing the proper
> >>>>> DMA API based ownership management.
> >>>>
> >>>> That doesn't really work for uncached buffers.
> >>>
> >>> What uncached buffers?
> >>
> >> For example these ones:
> >> https://android.googlesource.com/kernel/common/+/refs/heads/android-mainline/drivers/dma-buf/heaps/system_heap.c#141
> >
> > Whatever that code is doing is probably not upstream because it's too
> > broken to live.
>
> Indeed, at a glance it appears to be trying to reinvent
> dma_alloc_noncontiguous(). What's not immediately obvious is whether
> it's particular about allocations being DMA-contiguous; if not then I
> think it comes down to the same thing as vb2-dma-sg and the ideas we
> were tossing around for that[1].

This heap isn't too picky about the memory being contiguous, but there
is an attempt to allocate high order pages if possible to reduce the
number of translation entries. These page orders are one thing vendors
sometimes change for the hardware they have.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/dma-buf/heaps/system_heap.c#n48
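
The relevant bit is roughly this (paraphrasing the allocation policy
from system_heap.c; the exact GFP flags may not be verbatim):

#include <linux/gfp.h>
#include <linux/kernel.h>
#include <linux/mm.h>

/* Prefer large orders to keep scatterlist/IOMMU entries down, falling
 * back to order-0 pages.  Vendors commonly tune orders[] for their
 * hardware. */
#define LOW_ORDER_GFP   (GFP_HIGHUSER | __GFP_ZERO | __GFP_COMP)
#define HIGH_ORDER_GFP  (((GFP_HIGHUSER | __GFP_ZERO | __GFP_NOWARN \
                           | __GFP_NORETRY) & ~__GFP_RECLAIM) | __GFP_COMP)

static gfp_t order_flags[] = {HIGH_ORDER_GFP, HIGH_ORDER_GFP, LOW_ORDER_GFP};
static const unsigned int orders[] = {8, 4, 0};
#define NUM_ORDERS ARRAY_SIZE(orders)

static struct page *alloc_largest_available(unsigned long size,
                                            unsigned int max_order)
{
        struct page *page;
        int i;

        for (i = 0; i < NUM_ORDERS; i++) {
                if (size < (PAGE_SIZE << orders[i]))
                        continue;
                if (max_order < orders[i])
                        continue;

                page = alloc_pages(order_flags[i], orders[i]);
                if (!page)
                        continue;
                return page;
        }
        return NULL;
}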

I think one problem with attempting to use dma_alloc_noncontiguous for
dmabuf exporters is that memory is typically (always?) allocated when
the buffer is exported, and there won't be any device attachments at
that time.


> Thanks,
> Robin.
>
> [1]
> https://lore.kernel.org/linux-media/[email protected]/

2024-05-09 18:33:19

by T.J. Mercier

Subject: Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct

On Thu, May 9, 2024 at 6:07 AM Christoph Hellwig <[email protected]> wrote:
>
> On Thu, May 09, 2024 at 08:49:40AM +0100, Catalin Marinas wrote:
> > I see the swiotlb use as some internal detail of the DMA API
> > implementation that should not leak outside this framework.
>
> And that's what it is.
>
> > I think we should prevent bouncing if DMA_ATTR_SKIP_CPU_SYNC is passed.
> > However, this is not sufficient with a proper use of the DMA API since
> > the first dma_map_*() without this attribute can still do the bouncing.
> > IMHO what we need is a DMA_ATTR_NO_BOUNCE or DMA_ATTR_SHARED that will
> > be used on the first map and potentially on subsequent calls in
> > combination with DMA_ATTR_SKIP_CPU_SYNC (though we could use the latter
> > to imply "shared"). The downside is that mapping may fail if the
> > coherent mask is too narrow.
>
> We have two big problems here that kinda interact:
>
> 1) DMA_ATTR_SKIP_CPU_SYNC is just a horrible API. It exposes an
> implementation detail instead of dealing with use cases.
> The original one IIRC was to deal with networking receive
> buffers that are often only partially filled and the networking
> folks wanted to avoid the overhead for doing the cache operations
> for the rest. It kinda works for that but already gets iffy
> when swiotlb is involved. The other abuses of the flag just
> went downhill from there.
>
> 2) the model of dma mapping a single chunk of memory to multiple
> devices is not really well accounted for in the DMA API.
>
> So for (2) we need a memory allocator that can take the constraints
> of multiple devices into account, and probably a way to fail a
> dma-buf attach when the importer can't address the memory.
> We also then need to come up with a memory ownership / cache
> maintenance protocol that works for this use case.

Being able to fail the attach without necessarily performing any
mapping yet would be an improvement. However, I think the original idea
was for dmabuf exporters to perform the constraint solving (if
possible) as attachments get added, and then finally allocate however
is best when the buffer is first mapped. But as far as I know there
are no exporters that currently do this. Instead I think the problem
is currently being avoided by using custom exporters for particular
sets of use cases that are known to work on a given system. This
swiotlb + uncached example is one reason we'd want to fail the
constraint solving. The DMA API knows about the swiotlb part but not
really about the uncached part.