2019-06-14 10:22:02

by Christoph Hellwig

[permalink] [raw]
Subject: [RFC] switch m68k to use the generic remapping DMA allocator

Hi Geert and Greg,

can you take a look at the (untested) patches below? They convert m68k
to use the generic remapping DMA allocator, which is also used by
arm64 and csky.


2019-06-14 10:22:07

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 1/2] m68k: use the generic dma coherent remap allocator

This switche to using common code for the DMA allocations, including
potential use of the CMA allocator if configure. Also add a
comment where the existing behavior seems to be lacking.

Signed-off-by: Christoph Hellwig <[email protected]>
---
arch/m68k/Kconfig | 2 ++
arch/m68k/kernel/dma.c | 59 ++++++++----------------------------------
2 files changed, 13 insertions(+), 48 deletions(-)

diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
index 218e037ef901..2571a8fba4b0 100644
--- a/arch/m68k/Kconfig
+++ b/arch/m68k/Kconfig
@@ -3,10 +3,12 @@ config M68K
bool
default y
select ARCH_32BIT_OFF_T
+ select ARCH_HAS_DMA_MMAP_PGPROT if MMU && !COLDFIRE
select ARCH_HAS_SYNC_DMA_FOR_DEVICE if HAS_DMA
select ARCH_MIGHT_HAVE_PC_PARPORT if ISA
select ARCH_NO_COHERENT_DMA_MMAP if !MMU
select ARCH_NO_PREEMPT if !COLDFIRE
+ select DMA_DIRECT_REMAP if MMU && !COLDFIRE
select HAVE_IDE
select HAVE_AOUT if MMU
select HAVE_DEBUG_BUGVERBOSE
diff --git a/arch/m68k/kernel/dma.c b/arch/m68k/kernel/dma.c
index b4aa853051bd..9c6a350a16d8 100644
--- a/arch/m68k/kernel/dma.c
+++ b/arch/m68k/kernel/dma.c
@@ -18,57 +18,20 @@
#include <asm/pgalloc.h>

#if defined(CONFIG_MMU) && !defined(CONFIG_COLDFIRE)
-
-void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
- gfp_t flag, unsigned long attrs)
+pgprot_t arch_dma_mmap_pgprot(struct device *dev, pgprot_t prot,
+ unsigned long attrs)
{
- struct page *page, **map;
- pgprot_t pgprot;
- void *addr;
- int i, order;
-
- pr_debug("dma_alloc_coherent: %d,%x\n", size, flag);
-
- size = PAGE_ALIGN(size);
- order = get_order(size);
-
- page = alloc_pages(flag | __GFP_ZERO, order);
- if (!page)
- return NULL;
-
- *handle = page_to_phys(page);
- map = kmalloc(sizeof(struct page *) << order, flag & ~__GFP_DMA);
- if (!map) {
- __free_pages(page, order);
- return NULL;
+ /*
+ * XXX: this doesn't seem to handle the sun3 MMU at all.
+ */
+ if (CPU_IS_040_OR_060) {
+ pgprot_val(prot) &= ~_PAGE_CACHE040;
+ pgprot_val(prot) |= _PAGE_GLOBAL040 | _PAGE_NOCACHE_S;
+ } else {
+ pgprot_val(prot) |= _PAGE_NOCACHE030;
}
- split_page(page, order);
-
- order = 1 << order;
- size >>= PAGE_SHIFT;
- map[0] = page;
- for (i = 1; i < size; i++)
- map[i] = page + i;
- for (; i < order; i++)
- __free_page(page + i);
- pgprot = __pgprot(_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_DIRTY);
- if (CPU_IS_040_OR_060)
- pgprot_val(pgprot) |= _PAGE_GLOBAL040 | _PAGE_NOCACHE_S;
- else
- pgprot_val(pgprot) |= _PAGE_NOCACHE030;
- addr = vmap(map, size, VM_MAP, pgprot);
- kfree(map);
-
- return addr;
+ return prot;
}
-
-void arch_dma_free(struct device *dev, size_t size, void *addr,
- dma_addr_t handle, unsigned long attrs)
-{
- pr_debug("dma_free_coherent: %p, %x\n", addr, handle);
- vfree(addr);
-}
-
#else

#include <asm/cacheflush.h>
--
2.20.1

2019-06-14 10:23:03

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 2/2] m68k: implement arch_dma_prep_coherent

When we remap memory as non-cached to be used as a DMA coherent buffer
we should writeback all cache and invalidate the cache lines so that
we make sure we have a clean slate. Implement this using the
cache_push() helper.

Signed-off-by: Christoph Hellwig <[email protected]>
---
arch/m68k/Kconfig | 1 +
arch/m68k/kernel/dma.c | 5 +++++
2 files changed, 6 insertions(+)

diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
index 2571a8fba4b0..64b122595896 100644
--- a/arch/m68k/Kconfig
+++ b/arch/m68k/Kconfig
@@ -4,6 +4,7 @@ config M68K
default y
select ARCH_32BIT_OFF_T
select ARCH_HAS_DMA_MMAP_PGPROT if MMU && !COLDFIRE
+ select ARCH_HAS_DMA_PREP_COHERENT
select ARCH_HAS_SYNC_DMA_FOR_DEVICE if HAS_DMA
select ARCH_MIGHT_HAVE_PC_PARPORT if ISA
select ARCH_NO_COHERENT_DMA_MMAP if !MMU
diff --git a/arch/m68k/kernel/dma.c b/arch/m68k/kernel/dma.c
index 9c6a350a16d8..e720e6eed838 100644
--- a/arch/m68k/kernel/dma.c
+++ b/arch/m68k/kernel/dma.c
@@ -18,6 +18,11 @@
#include <asm/pgalloc.h>

#if defined(CONFIG_MMU) && !defined(CONFIG_COLDFIRE)
+void arch_dma_prep_coherent(struct page *page, size_t size)
+{
+ cache_push(page_to_phys(page), size);
+}
+
pgprot_t arch_dma_mmap_pgprot(struct device *dev, pgprot_t prot,
unsigned long attrs)
{
--
2.20.1

2019-06-17 13:39:37

by Greg Ungerer

[permalink] [raw]
Subject: Re: [RFC] switch m68k to use the generic remapping DMA allocator

Hi Christoph,

On 14/6/19 8:21 pm, Christoph Hellwig wrote:
> Hi Geert and Greg,
>
> can you take a look at the (untested) patches below? They convert m68k
> to use the generic remapping DMA allocator, which is also used by
> arm64 and csky.

No impact to ColdFire targets, so I'll have to defer to Geert
for his thoughts on the legacy m68k impact.

Regards
Greg

2019-06-17 18:54:38

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [RFC] switch m68k to use the generic remapping DMA allocator

Hi Christoph,

On Fri, Jun 14, 2019 at 12:21 PM Christoph Hellwig <[email protected]> wrote:
> can you take a look at the (untested) patches below? They convert m68k
> to use the generic remapping DMA allocator, which is also used by
> arm64 and csky.

Thanks. But what does this buy us?

bloat-o-meter says:

add/remove: 75/0 grow/shrink: 11/6 up/down: 4122/-82 (4040)

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2019-06-24 07:21:04

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [RFC] switch m68k to use the generic remapping DMA allocator

Hi Christoph,

On Fri, Jun 14, 2019 at 12:21 PM Christoph Hellwig <[email protected]> wrote:
> can you take a look at the (untested) patches below? They convert m68k
> to use the generic remapping DMA allocator, which is also used by
> arm64 and csky.

When building for Sun-3:

kernel/dma/remap.o: In function `arch_dma_alloc':
remap.c:(.text+0x316): undefined reference to `__dma_direct_alloc_pages'
remap.c:(.text+0x32a): undefined reference to `arch_dma_prep_coherent'
remap.c:(.text+0x34e): undefined reference to `arch_dma_mmap_pgprot'
remap.c:(.text+0x378): undefined reference to `__dma_direct_free_pages'
remap.c:(.text+0x3f4): undefined reference to `arch_dma_prep_coherent'
remap.c:(.text+0x40a): undefined reference to `__dma_direct_alloc_pages'
kernel/dma/remap.o: In function `arch_dma_free':
remap.c:(.text+0x446): undefined reference to `__dma_direct_free_pages'
remap.c:(.text+0x4a8): undefined reference to `__dma_direct_free_pages'
kernel/dma/remap.o: In function `dma_atomic_pool_init':
remap.c:(.init.text+0x66): undefined reference to `arch_dma_prep_coherent'

Doing

- select DMA_DIRECT_REMAP if MMU && !COLDFIRE
+ select DMA_DIRECT_REMAP if MMU && !COLDFIRE && !SUN3

in arch/m68k/Kconfig fixes the build.

Alternatively, you could use:

- select DMA_DIRECT_REMAP if MMU && !COLDFIRE
+ select DMA_DIRECT_REMAP if HAS_DMA && MMU && !COLDFIRE

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2019-06-25 07:41:48

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [RFC] switch m68k to use the generic remapping DMA allocator

On Mon, Jun 17, 2019 at 08:53:55PM +0200, Geert Uytterhoeven wrote:
> Hi Christoph,
>
> On Fri, Jun 14, 2019 at 12:21 PM Christoph Hellwig <[email protected]> wrote:
> > can you take a look at the (untested) patches below? They convert m68k
> > to use the generic remapping DMA allocator, which is also used by
> > arm64 and csky.
>
> Thanks. But what does this buy us?

A common dma mapping code base with everyone, including supporting
DMA allocations from atomic context, which the documentation and
API assume are there, but which don't work on m68k.

> bloat-o-meter says:
>
> add/remove: 75/0 grow/shrink: 11/6 up/down: 4122/-82 (4040)

What do these values stand for? The code should grow a little as
we now need to include the the pool allocator for the above API
fix.

2019-06-25 07:41:49

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [RFC] switch m68k to use the generic remapping DMA allocator

On Mon, Jun 24, 2019 at 09:16:30AM +0200, Geert Uytterhoeven wrote:
> Doing
>
> - select DMA_DIRECT_REMAP if MMU && !COLDFIRE
> + select DMA_DIRECT_REMAP if MMU && !COLDFIRE && !SUN3
>
> in arch/m68k/Kconfig fixes the build.
>
> Alternatively, you could use:
>
> - select DMA_DIRECT_REMAP if MMU && !COLDFIRE
> + select DMA_DIRECT_REMAP if HAS_DMA && MMU && !COLDFIRE

The latter might be a little cleaner.

2019-06-25 07:51:39

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [RFC] switch m68k to use the generic remapping DMA allocator

Hi Christoph,

On Tue, Jun 25, 2019 at 8:33 AM Christoph Hellwig <[email protected]> wrote:
> On Mon, Jun 17, 2019 at 08:53:55PM +0200, Geert Uytterhoeven wrote:
> > On Fri, Jun 14, 2019 at 12:21 PM Christoph Hellwig <[email protected]> wrote:
> > > can you take a look at the (untested) patches below? They convert m68k
> > > to use the generic remapping DMA allocator, which is also used by
> > > arm64 and csky.
> >
> > Thanks. But what does this buy us?
>
> A common dma mapping code base with everyone, including supporting
> DMA allocations from atomic context, which the documentation and
> API assume are there, but which don't work on m68k.

OK, thanks!

> > bloat-o-meter says:
> >
> > add/remove: 75/0 grow/shrink: 11/6 up/down: 4122/-82 (4040)
>
> What do these values stand for? The code should grow a little as
> we now need to include the the pool allocator for the above API
> fix.

Last 3 values are "bytes added/removed (net increase)".
So this increases the static kernel size by ca. 4 KiB.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2019-06-25 07:53:08

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [RFC] switch m68k to use the generic remapping DMA allocator

On Tue, Jun 25, 2019 at 09:26:48AM +0200, Geert Uytterhoeven wrote:
> > > bloat-o-meter says:
> > >
> > > add/remove: 75/0 grow/shrink: 11/6 up/down: 4122/-82 (4040)
> >
> > What do these values stand for? The code should grow a little as
> > we now need to include the the pool allocator for the above API
> > fix.
>
> Last 3 values are "bytes added/removed (net increase)".
> So this increases the static kernel size by ca. 4 KiB.

That seems a lot for the little bit of pool code. Did m68k not
build lib/genalloc.c by default before?

Also I'd be curious what the first 4 values are.

2019-06-25 08:08:32

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [RFC] switch m68k to use the generic remapping DMA allocator

Hi Christoph,

On Tue, Jun 25, 2019 at 9:35 AM Christoph Hellwig <[email protected]> wrote:
> On Tue, Jun 25, 2019 at 09:26:48AM +0200, Geert Uytterhoeven wrote:
> > > > bloat-o-meter says:
> > > >
> > > > add/remove: 75/0 grow/shrink: 11/6 up/down: 4122/-82 (4040)
> > >
> > > What do these values stand for? The code should grow a little as
> > > we now need to include the the pool allocator for the above API
> > > fix.
> >
> > Last 3 values are "bytes added/removed (net increase)".
> > So this increases the static kernel size by ca. 4 KiB.
>
> That seems a lot for the little bit of pool code. Did m68k not

Exactly, hence my original question...

> build lib/genalloc.c by default before?

Indeed, CONFIG_GENERIC_ALLOCATOR wasn't enabled before.

--- .config.orig 2019-06-25 09:53:35.098691378 +0200
+++ .config 2019-06-25 09:59:23.914874446 +0200
@@ -2401,6 +2401,7 @@
CONFIG_DECOMPRESS_XZ=y
CONFIG_DECOMPRESS_LZO=y
CONFIG_DECOMPRESS_LZ4=y
+CONFIG_GENERIC_ALLOCATOR=y
CONFIG_TEXTSEARCH=y
CONFIG_TEXTSEARCH_KMP=m
CONFIG_TEXTSEARCH_BM=m
@@ -2409,6 +2410,10 @@
CONFIG_HAS_IOMEM=y
CONFIG_HAS_DMA=y
CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE=y
+CONFIG_ARCH_HAS_DMA_PREP_COHERENT=y
+CONFIG_ARCH_HAS_DMA_MMAP_PGPROT=y
+CONFIG_DMA_REMAP=y
+CONFIG_DMA_DIRECT_REMAP=y
# CONFIG_DMA_API_DEBUG is not set
CONFIG_SGL_ALLOC=y
CONFIG_DQL=y

> Also I'd be curious what the first 4 values are.

Reading scripts/bloat-o-meter in the kernel source tree tells me number of
added/removed symbols (functions or variables), and number of
grown/shrunk symbols.

I run it regularly, to catch silly mistakes (cfr. my favorite one, fixed
by commit 23c323af0375a7f6 ("SUNRPC: No, I did not intend to create a
256KiB hashtable")).

Full output before/after for an atari_defconfig kernel
(numbers are different, baseline has changed, too):

$ bloat-o-meter vmlinux.orig vmlinux
add/remove: 75/0 grow/shrink: 1/2 up/down: 4098/-28 (4070)
Function old new delta
gen_pool_alloc_algo_owner - 392 +392
dma_atomic_pool_init - 360 +360
gen_pool_best_fit - 248 +248
dma_common_contiguous_remap - 238 +238
gen_pool_destroy - 190 +190
gen_pool_free_owner - 184 +184
devm_gen_pool_create - 174 +174
dma_alloc_from_pool - 152 +152
bitmap_clear_ll - 138 +138
arch_dma_free 12 136 +124
dma_common_free_remap - 110 +110
gen_pool_add_owner - 108 +108
gen_pool_fixed_alloc - 100 +100
dma_common_pages_remap - 92 +92
gen_pool_dma_alloc - 78 +78
arch_dma_prep_coherent - 76 +76
clear_bits_ll - 66 +66
dma_free_from_pool - 62 +62
set_bits_ll - 60 +60
devm_gen_pool_match - 56 +56
addr_in_gen_pool - 56 +56
gen_pool_create - 54 +54
gen_pool_virt_to_phys - 52 +52
gen_pool_first_fit_align - 52 +52
gen_pool_for_each_chunk - 42 +42
gen_pool_get - 40 +40
gen_pool_first_fit_order_align - 36 +36
gen_pool_set_algo - 34 +34
early_coherent_pool - 32 +32
__kstrtab_gen_pool_first_fit_order_align - 31 +31
gen_pool_size - 30 +30
arch_dma_mmap_pgprot - 28 +28
dma_in_atomic_pool - 26 +26
__kstrtab_gen_pool_alloc_algo_owner - 26 +26
__kstrtab_gen_pool_first_fit_align - 25 +25
gen_pool_avail - 24 +24
__kstrtab_gen_pool_for_each_chunk - 24 +24
__kstrtab_gen_pool_virt_to_phys - 22 +22
__kstrtab_gen_pool_fixed_alloc - 21 +21
__kstrtab_devm_gen_pool_create - 21 +21
arch_dma_coherent_to_pfn - 20 +20
__kstrtab_gen_pool_free_owner - 20 +20
__kstrtab_gen_pool_first_fit - 19 +19
__kstrtab_gen_pool_dma_alloc - 19 +19
__kstrtab_gen_pool_add_owner - 19 +19
__kstrtab_gen_pool_set_algo - 18 +18
__kstrtab_gen_pool_best_fit - 18 +18
__kstrtab_gen_pool_destroy - 17 +17
__kstrtab_gen_pool_create - 16 +16
__kstrtab_gen_pool_avail - 15 +15
gen_pool_first_fit - 14 +14
devm_gen_pool_release - 14 +14
__setup_str_early_coherent_pool - 14 +14
__kstrtab_gen_pool_size - 14 +14
__kstrtab_gen_pool_get - 13 +13
__setup_early_coherent_pool - 12 +12
__ksymtab_gen_pool_virt_to_phys - 8 +8
__ksymtab_gen_pool_size - 8 +8
__ksymtab_gen_pool_set_algo - 8 +8
__ksymtab_gen_pool_get - 8 +8
__ksymtab_gen_pool_free_owner - 8 +8
__ksymtab_gen_pool_for_each_chunk - 8 +8
__ksymtab_gen_pool_fixed_alloc - 8 +8
__ksymtab_gen_pool_first_fit_order_align - 8 +8
__ksymtab_gen_pool_first_fit_align - 8 +8
__ksymtab_gen_pool_first_fit - 8 +8
__ksymtab_gen_pool_dma_alloc - 8 +8
__ksymtab_gen_pool_destroy - 8 +8
__ksymtab_gen_pool_create - 8 +8
__ksymtab_gen_pool_best_fit - 8 +8
__ksymtab_gen_pool_avail - 8 +8
__ksymtab_gen_pool_alloc_algo_owner - 8 +8
__ksymtab_gen_pool_add_owner - 8 +8
__ksymtab_devm_gen_pool_create - 8 +8
atomic_pool_size - 4 +4
atomic_pool - 4 +4
arch_dma_alloc 312 310 -2
dma_common_mmap 250 224 -26
Total: Before=3724055, After=3728125, chg +0.11%


Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds