2018-12-14 08:26:32

by Christoph Hellwig

Subject: ensure dma_alloc_coherent always returns zeroed memory

For security reasons we already return zeroed memory from
dma_alloc_coherent on most common platforms, but some implementations
missed out. Make sure we provide consistent behavior.
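
As a minimal driver-side sketch of the contract this series establishes
(the helper name and buffer size below are illustrative placeholders, not
code from the series), callers can rely on the returned memory being
zeroed without passing __GFP_ZERO or calling memset() themselves:

#include <linux/dma-mapping.h>

#define RING_BYTES 4096 /* hypothetical buffer size, for illustration */

/* Returns memory that is guaranteed zeroed once this series is applied. */
static void *alloc_ring(struct device *dev, dma_addr_t *dma)
{
        return dma_alloc_coherent(dev, RING_BYTES, dma, GFP_KERNEL);
}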


2018-12-14 08:26:35

by Christoph Hellwig

Subject: [PATCH 2/2] dma-mapping: deprecate dma_zalloc_coherent

We now always return zeroed memory from dma_alloc_coherent. Note that
simply passing __GFP_ZERO to dma_alloc_coherent wasn't always doing the
right thing to start with, given that various allocators are not backed
by the page allocator and thus would ignore __GFP_ZERO.

Signed-off-by: Christoph Hellwig <[email protected]>
---
Documentation/DMA-API.txt | 9 ---------
include/linux/dma-mapping.h | 7 ++++---
2 files changed, 4 insertions(+), 12 deletions(-)

diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt
index 016eb6909b8a..e133ccd60228 100644
--- a/Documentation/DMA-API.txt
+++ b/Documentation/DMA-API.txt
@@ -58,15 +58,6 @@ specify the ``GFP_`` flags (see kmalloc()) for the allocation (the
implementation may choose to ignore flags that affect the location of
the returned memory, like GFP_DMA).

-::
-
- void *
- dma_zalloc_coherent(struct device *dev, size_t size,
- dma_addr_t *dma_handle, gfp_t flag)
-
-Wraps dma_alloc_coherent() and also zeroes the returned memory if the
-allocation attempt succeeded.
-
::

void
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index f422aec0f53c..a52c6409bdc2 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -644,12 +644,13 @@ static inline unsigned long dma_max_pfn(struct device *dev)
}
#endif

+/*
+ * Please always use dma_alloc_coherent instead as it already zeroes the memory!
+ */
static inline void *dma_zalloc_coherent(struct device *dev, size_t size,
dma_addr_t *dma_handle, gfp_t flag)
{
- void *ret = dma_alloc_coherent(dev, size, dma_handle,
- flag | __GFP_ZERO);
- return ret;
+ return dma_alloc_coherent(dev, size, dma_handle, flag);
}

static inline int dma_get_cache_alignment(void)
--
2.19.2
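
As a hand-waved sketch of the note above, assuming a hypothetical
pool-backed implementation ('coherent_pool' is a gen_pool set up
elsewhere; none of this is from the patch): the gfp flags never reach
the page allocator, so __GFP_ZERO would be silently dropped and the
zeroing has to be explicit.

#include <linux/genalloc.h>
#include <linux/string.h>

/*
 * Pool-backed coherent allocator sketch: the memory comes from a region
 * carved out at boot, so gfp (and any __GFP_ZERO in it) is ignored.
 */
static void *pool_dma_alloc(struct gen_pool *coherent_pool, size_t size,
                gfp_t gfp)
{
        void *va = (void *)gen_pool_alloc(coherent_pool, size);

        if (va)
                memset(va, 0, size); /* the zeroing this series mandates */
        return va;
}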


2018-12-14 08:26:40

by Christoph Hellwig

Subject: [PATCH 1/2] dma-mapping: zero memory returned from dma_alloc_*

If we want to map memory from the DMA allocator to userspace it must be
zeroed at allocation time to prevent stale data leaks. We already do
this on most common architectures, but some architectures don't do this
yet, fix them up, either by passing GFP_ZERO when we use the normal page
allocator or doing a manual memset otherwise.

Signed-off-by: Christoph Hellwig <[email protected]>
---
arch/alpha/kernel/pci_iommu.c | 2 +-
arch/arc/mm/dma.c | 2 +-
arch/c6x/mm/dma-coherent.c | 5 ++++-
arch/m68k/kernel/dma.c | 2 +-
arch/microblaze/mm/consistent.c | 2 +-
arch/openrisc/kernel/dma.c | 2 +-
arch/parisc/kernel/pci-dma.c | 4 ++--
arch/s390/pci/pci_dma.c | 2 +-
arch/sparc/kernel/ioport.c | 2 +-
arch/sparc/mm/io-unit.c | 2 +-
arch/sparc/mm/iommu.c | 2 +-
arch/xtensa/kernel/pci-dma.c | 2 +-
drivers/misc/mic/host/mic_boot.c | 2 +-
kernel/dma/virt.c | 2 +-
14 files changed, 18 insertions(+), 15 deletions(-)

diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
index e1716e0d92fd..28a025eda80d 100644
--- a/arch/alpha/kernel/pci_iommu.c
+++ b/arch/alpha/kernel/pci_iommu.c
@@ -443,7 +443,7 @@ static void *alpha_pci_alloc_coherent(struct device *dev, size_t size,
gfp &= ~GFP_DMA;

try_again:
- cpu_addr = (void *)__get_free_pages(gfp, order);
+ cpu_addr = (void *)__get_free_pages(gfp | GFP_ZERO, order);
if (! cpu_addr) {
printk(KERN_INFO "pci_alloc_consistent: "
"get_free_pages failed from %pf\n",
diff --git a/arch/arc/mm/dma.c b/arch/arc/mm/dma.c
index db203ff69ccf..b0754581efc6 100644
--- a/arch/arc/mm/dma.c
+++ b/arch/arc/mm/dma.c
@@ -33,7 +33,7 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
*/
BUG_ON(gfp & __GFP_HIGHMEM);

- page = alloc_pages(gfp, order);
+ page = alloc_pages(gfp | GFP_ZERO, order);
if (!page)
return NULL;

diff --git a/arch/c6x/mm/dma-coherent.c b/arch/c6x/mm/dma-coherent.c
index 01305c787201..75b79571732c 100644
--- a/arch/c6x/mm/dma-coherent.c
+++ b/arch/c6x/mm/dma-coherent.c
@@ -78,6 +78,7 @@ static void __free_dma_pages(u32 addr, int order)
void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
gfp_t gfp, unsigned long attrs)
{
+ void *ret;
u32 paddr;
int order;

@@ -94,7 +95,9 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
if (!paddr)
return NULL;

- return phys_to_virt(paddr);
+ ret = phys_to_virt(paddr);
+ memset(ret, 0, PAGE_SIZE << order);
+ return ret;
}

/*
diff --git a/arch/m68k/kernel/dma.c b/arch/m68k/kernel/dma.c
index e99993c57d6b..dafe99d08a6a 100644
--- a/arch/m68k/kernel/dma.c
+++ b/arch/m68k/kernel/dma.c
@@ -32,7 +32,7 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
size = PAGE_ALIGN(size);
order = get_order(size);

- page = alloc_pages(flag, order);
+ page = alloc_pages(flag | GFP_ZERO, order);
if (!page)
return NULL;

diff --git a/arch/microblaze/mm/consistent.c b/arch/microblaze/mm/consistent.c
index 45e0a1aa9357..79b9f4695a1b 100644
--- a/arch/microblaze/mm/consistent.c
+++ b/arch/microblaze/mm/consistent.c
@@ -81,7 +81,7 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
size = PAGE_ALIGN(size);
order = get_order(size);

- vaddr = __get_free_pages(gfp, order);
+ vaddr = __get_free_pages(gfp | GFP_ZERO, order);
if (!vaddr)
return NULL;

diff --git a/arch/openrisc/kernel/dma.c b/arch/openrisc/kernel/dma.c
index 159336adfa2f..cdd03f63207c 100644
--- a/arch/openrisc/kernel/dma.c
+++ b/arch/openrisc/kernel/dma.c
@@ -89,7 +89,7 @@ arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
.mm = &init_mm
};

- page = alloc_pages_exact(size, gfp);
+ page = alloc_pages_exact(size, gfp | GFP_ZERO);
if (!page)
return NULL;

diff --git a/arch/parisc/kernel/pci-dma.c b/arch/parisc/kernel/pci-dma.c
index 04c48f1ef3fb..7fa396714b5a 100644
--- a/arch/parisc/kernel/pci-dma.c
+++ b/arch/parisc/kernel/pci-dma.c
@@ -404,7 +404,7 @@ static void *pcxl_dma_alloc(struct device *dev, size_t size,
order = get_order(size);
size = 1 << (order + PAGE_SHIFT);
vaddr = pcxl_alloc_range(size);
- paddr = __get_free_pages(flag, order);
+ paddr = __get_free_pages(flag | GFP_ZERO, order);
flush_kernel_dcache_range(paddr, size);
paddr = __pa(paddr);
map_uncached_pages(vaddr, size, paddr);
@@ -429,7 +429,7 @@ static void *pcx_dma_alloc(struct device *dev, size_t size,
if ((attrs & DMA_ATTR_NON_CONSISTENT) == 0)
return NULL;

- addr = (void *)__get_free_pages(flag, get_order(size));
+ addr = (void *)__get_free_pages(flag | GFP_ZERO, get_order(size));
if (addr)
*dma_handle = (dma_addr_t)virt_to_phys(addr);

diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
index 346ba382193a..2578d9567d86 100644
--- a/arch/s390/pci/pci_dma.c
+++ b/arch/s390/pci/pci_dma.c
@@ -404,7 +404,7 @@ static void *s390_dma_alloc(struct device *dev, size_t size,
dma_addr_t map;

size = PAGE_ALIGN(size);
- page = alloc_pages(flag, get_order(size));
+ page = alloc_pages(flag | GFP_ZERO, get_order(size));
if (!page)
return NULL;

diff --git a/arch/sparc/kernel/ioport.c b/arch/sparc/kernel/ioport.c
index baa235652c27..b3a0c5adeed5 100644
--- a/arch/sparc/kernel/ioport.c
+++ b/arch/sparc/kernel/ioport.c
@@ -325,7 +325,7 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
return NULL;

size = PAGE_ALIGN(size);
- va = (void *) __get_free_pages(gfp, get_order(size));
+ va = (void *) __get_free_pages(gfp | GFP_ZERO, get_order(size));
if (!va) {
printk("%s: no %zd pages\n", __func__, size >> PAGE_SHIFT);
return NULL;
diff --git a/arch/sparc/mm/io-unit.c b/arch/sparc/mm/io-unit.c
index 2088d292c6e5..d1729c8b0bf9 100644
--- a/arch/sparc/mm/io-unit.c
+++ b/arch/sparc/mm/io-unit.c
@@ -225,7 +225,7 @@ static void *iounit_alloc(struct device *dev, size_t len,
return NULL;

len = PAGE_ALIGN(len);
- va = __get_free_pages(gfp, get_order(len));
+ va = __get_free_pages(gfp | GFP_ZERO, get_order(len));
if (!va)
return NULL;

diff --git a/arch/sparc/mm/iommu.c b/arch/sparc/mm/iommu.c
index 3599485717e7..25c83078ece7 100644
--- a/arch/sparc/mm/iommu.c
+++ b/arch/sparc/mm/iommu.c
@@ -347,7 +347,7 @@ static void *sbus_iommu_alloc(struct device *dev, size_t len,
return NULL;

len = PAGE_ALIGN(len);
- va = __get_free_pages(gfp, get_order(len));
+ va = __get_free_pages(gfp | GFP_ZERO, get_order(len));
if (va == 0)
return NULL;

diff --git a/arch/xtensa/kernel/pci-dma.c b/arch/xtensa/kernel/pci-dma.c
index 1fc138b6bc0a..e9fbec5f6ec2 100644
--- a/arch/xtensa/kernel/pci-dma.c
+++ b/arch/xtensa/kernel/pci-dma.c
@@ -160,7 +160,7 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
flag & __GFP_NOWARN);

if (!page)
- page = alloc_pages(flag, get_order(size));
+ page = alloc_pages(flag | GFP_ZERO, get_order(size));

if (!page)
return NULL;
diff --git a/drivers/misc/mic/host/mic_boot.c b/drivers/misc/mic/host/mic_boot.c
index c327985c9523..7e5bd8654fd4 100644
--- a/drivers/misc/mic/host/mic_boot.c
+++ b/drivers/misc/mic/host/mic_boot.c
@@ -149,7 +149,7 @@ static void *__mic_dma_alloc(struct device *dev, size_t size,
struct scif_hw_dev *scdev = dev_get_drvdata(dev);
struct mic_device *mdev = scdev_to_mdev(scdev);
dma_addr_t tmp;
- void *va = kmalloc(size, gfp);
+ void *va = kmalloc(size, gfp | GFP_ZERO);

if (va) {
tmp = mic_map_single(mdev, va, size);
diff --git a/kernel/dma/virt.c b/kernel/dma/virt.c
index 631ddec4b60a..f80990f882a6 100644
--- a/kernel/dma/virt.c
+++ b/kernel/dma/virt.c
@@ -13,7 +13,7 @@ static void *dma_virt_alloc(struct device *dev, size_t size,
{
void *ret;

- ret = (void *)__get_free_pages(gfp, get_order(size));
+ ret = (void *)__get_free_pages(gfp | GFP_ZERO, get_order(size));
if (ret)
*dma_handle = (uintptr_t)ret;
return ret;
--
2.19.2


2018-12-14 09:55:49

by Geert Uytterhoeven

Subject: Re: [PATCH 1/2] dma-mapping: zero memory returned from dma_alloc_*

Hi Christoph,

On Fri, Dec 14, 2018 at 9:26 AM Christoph Hellwig <[email protected]> wrote:
> If we want to map memory from the DMA allocator to userspace it must be
> zeroed at allocation time to prevent stale data leaks. We already do
> this on most common architectures, but some architectures don't do this
> yet, fix them up, either by passing GFP_ZERO when we use the normal page
> allocator or doing a manual memset otherwise.
>
> Signed-off-by: Christoph Hellwig <[email protected]>

Thanks for your patch!

> --- a/arch/m68k/kernel/dma.c
> +++ b/arch/m68k/kernel/dma.c
> @@ -32,7 +32,7 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
> size = PAGE_ALIGN(size);
> order = get_order(size);
>
> - page = alloc_pages(flag, order);
> + page = alloc_pages(flag | GFP_ZERO, order);
> if (!page)
> return NULL;

There's a second implementation below, which calls __get_free_pages() and
does an explicit memset(). As __get_free_pages() calls alloc_pages(), perhaps
it makes sense to replace the memset() with GFP_ZERO, to increase consistency?

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
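
The two zeroing styles Geert contrasts above, side by side as a minimal
sketch (hypothetical helpers, not code from the patch):

#include <linux/gfp.h>
#include <linux/string.h>

/* Zeroing via the page allocator, as the patch does: */
static void *alloc_zeroed_gfp(gfp_t gfp, unsigned int order)
{
        return (void *)__get_free_pages(gfp | __GFP_ZERO, order);
}

/* Equivalent manual zeroing, as the second m68k implementation does: */
static void *alloc_zeroed_memset(gfp_t gfp, unsigned int order)
{
        void *va = (void *)__get_free_pages(gfp, order);

        if (va)
                memset(va, 0, PAGE_SIZE << order);
        return va;
}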

2018-12-14 09:57:02

by Geert Uytterhoeven

Subject: Re: [PATCH 1/2] dma-mapping: zero memory returned from dma_alloc_*

On Fri, Dec 14, 2018 at 10:54 AM Geert Uytterhoeven
<[email protected]> wrote:
> On Fri, Dec 14, 2018 at 9:26 AM Christoph Hellwig <[email protected]> wrote:
> > If we want to map memory from the DMA allocator to userspace it must be
> > zeroed at allocation time to prevent stale data leaks. We already do
> > this on most common architectures, but some architectures don't do this
> > yet, fix them up, either by passing GFP_ZERO when we use the normal page
> > allocator or doing a manual memset otherwise.
> >
> > Signed-off-by: Christoph Hellwig <[email protected]>
>
> Thanks for your patch!
>
> > --- a/arch/m68k/kernel/dma.c
> > +++ b/arch/m68k/kernel/dma.c
> > @@ -32,7 +32,7 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
> > size = PAGE_ALIGN(size);
> > order = get_order(size);
> >
> > - page = alloc_pages(flag, order);
> > + page = alloc_pages(flag | GFP_ZERO, order);
> > if (!page)
> > return NULL;
>
> There's a second implementation below, which calls __get_free_pages() and
> does an explicit memset(). As __get_free_pages() calls alloc_pages(), perhaps
> it makes sense to replace the memset() with GFP_ZERO, to increase consistency?

Regardless, for m68k:
Acked-by: Geert Uytterhoeven <[email protected]>

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2018-12-14 11:49:43

by Christoph Hellwig

Subject: Re: [PATCH 1/2] dma-mapping: zero memory returned from dma_alloc_*

On Fri, Dec 14, 2018 at 10:54:32AM +0100, Geert Uytterhoeven wrote:
> > - page = alloc_pages(flag, order);
> > + page = alloc_pages(flag | GFP_ZERO, order);
> > if (!page)
> > return NULL;
>
> There's a second implementation below, which calls __get_free_pages() and
> does an explicit memset(). As __get_free_pages() calls alloc_pages(), perhaps
> it makes sense to replace the memset() with GFP_ZERO, to increase consistency?

It would, but this patch really tries to be minimally invasive to just
provide the zeroing everywhere. There is plenty of opportunity
to improve the m68k dma allocator if I can get enough reviewers/testers:

- for one the coldfire/nommu case absolutely does not make sense to
me as there is no work done at all to make sure the memory is
mapped uncached despite the architecture implementing cache
flushing for the map interface. So this whole implementation
looks broken to me and will need some major work (I had a previous
discussion with Greg on that which needs to be dug out)
- the "regular" implementation in this patch should probably be replaced
with the generic remapping helpers that have been added for the 4.21
merge window:

http://git.infradead.org/users/hch/dma-mapping.git/commitdiff/0c3b3171ceccb8830c2bb5adff1b4e9b204c1450

Compile tested only patch below:

--
From ade86dc75b9850daf9111ebf9ce15825a6144f2d Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <[email protected]>
Date: Fri, 14 Dec 2018 12:41:45 +0100
Subject: m68k: use the generic dma coherent remap allocator

This switches to using common code for the DMA allocations, including
potential use of the CMA allocator if configured. Also add a few
comments where the existing behavior seems to be lacking.

Signed-off-by: Christoph Hellwig <[email protected]>
---
arch/m68k/Kconfig | 2 ++
arch/m68k/kernel/dma.c | 64 ++++++++++++------------------------------
2 files changed, 20 insertions(+), 46 deletions(-)

diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
index 8a5868e9a3a0..60788cf02fbc 100644
--- a/arch/m68k/Kconfig
+++ b/arch/m68k/Kconfig
@@ -2,10 +2,12 @@
config M68K
bool
default y
+ select ARCH_HAS_DMA_MMAP_PGPROT if MMU && !COLDFIRE
select ARCH_HAS_SYNC_DMA_FOR_DEVICE if HAS_DMA
select ARCH_MIGHT_HAVE_PC_PARPORT if ISA
select ARCH_NO_COHERENT_DMA_MMAP if !MMU
select ARCH_NO_PREEMPT if !COLDFIRE
+ select DMA_DIRECT_REMAP if MMU && !COLDFIRE
select HAVE_IDE
select HAVE_AOUT if MMU
select HAVE_DEBUG_BUGVERBOSE
diff --git a/arch/m68k/kernel/dma.c b/arch/m68k/kernel/dma.c
index dafe99d08a6a..16da5d96e228 100644
--- a/arch/m68k/kernel/dma.c
+++ b/arch/m68k/kernel/dma.c
@@ -18,57 +18,29 @@
#include <asm/pgalloc.h>

#if defined(CONFIG_MMU) && !defined(CONFIG_COLDFIRE)
-
-void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
- gfp_t flag, unsigned long attrs)
+void arch_dma_prep_coherent(struct page *page, size_t size)
{
- struct page *page, **map;
- pgprot_t pgprot;
- void *addr;
- int i, order;
-
- pr_debug("dma_alloc_coherent: %d,%x\n", size, flag);
-
- size = PAGE_ALIGN(size);
- order = get_order(size);
-
- page = alloc_pages(flag | GFP_ZERO, order);
- if (!page)
- return NULL;
-
- *handle = page_to_phys(page);
- map = kmalloc(sizeof(struct page *) << order, flag & ~__GFP_DMA);
- if (!map) {
- __free_pages(page, order);
- return NULL;
- }
- split_page(page, order);
-
- order = 1 << order;
- size >>= PAGE_SHIFT;
- map[0] = page;
- for (i = 1; i < size; i++)
- map[i] = page + i;
- for (; i < order; i++)
- __free_page(page + i);
- pgprot = __pgprot(_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_DIRTY);
- if (CPU_IS_040_OR_060)
- pgprot_val(pgprot) |= _PAGE_GLOBAL040 | _PAGE_NOCACHE_S;
- else
- pgprot_val(pgprot) |= _PAGE_NOCACHE030;
- addr = vmap(map, size, VM_MAP, pgprot);
- kfree(map);
-
- return addr;
+ /*
+ * XXX: don't we need to flush and invalidate the caches before
+ * creating a coherent mapping?
+ */
}

-void arch_dma_free(struct device *dev, size_t size, void *addr,
- dma_addr_t handle, unsigned long attrs)
+pgprot_t arch_dma_mmap_pgprot(struct device *dev, pgprot_t prot,
+ unsigned long attrs)
{
- pr_debug("dma_free_coherent: %p, %x\n", addr, handle);
- vfree(addr);
+ /*
+ * XXX: this doesn't seem to handle the sun3 MMU at all.
+ */
+ if (CPU_IS_040_OR_060) {
+ pgprot_val(prot) &= ~_PAGE_CACHE040;
+ pgprot_val(prot) |= _PAGE_GLOBAL040 | _PAGE_NOCACHE_S;
+ } else {
+ pgprot_val(prot) |= _PAGE_NOCACHE030;
+ }
+ return prot;
}
-
#else

#include <asm/cacheflush.h>
--
2.19.2
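
As a rough sketch of what those generic remapping helpers effectively do
on an MMU system (simplified pseudologic under assumptions, not the
actual common code): allocate zeroed pages, let the architecture flush
them out of the caches via arch_dma_prep_coherent(), then remap them
with an uncached pgprot.

#include <linux/dma-noncoherent.h>
#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/slab.h>
#include <linux/vmalloc.h>

static void *remap_alloc_sketch(size_t size, gfp_t gfp, pgprot_t prot)
{
        unsigned int i, count = PAGE_ALIGN(size) >> PAGE_SHIFT;
        int order = get_order(size);
        struct page *page, **pages;
        void *vaddr;

        page = alloc_pages(gfp | __GFP_ZERO, order);
        if (!page)
                return NULL;

        /* Write the zeroed lines back so the uncached mapping sees them. */
        arch_dma_prep_coherent(page, size);

        pages = kmalloc_array(count, sizeof(*pages), GFP_KERNEL);
        if (!pages) {
                __free_pages(page, order);
                return NULL;
        }
        for (i = 0; i < count; i++)
                pages[i] = page + i;

        vaddr = vmap(pages, count, VM_MAP, prot);
        kfree(pages);
        if (!vaddr)
                __free_pages(page, order);
        return vaddr;
}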


2018-12-14 12:23:03

by Christoph Hellwig

Subject: Re: [PATCH 1/2] dma-mapping: zero memory returned from dma_alloc_*

On Fri, Dec 14, 2018 at 12:12:00PM +0000, Eugeniy Paltsev wrote:
> Hi Christoph,
>
> do you have any public git repository with all your dma changes?

Most of the trees show up in my misc.git repo for testing.

This series is here:

http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/dma-alloc-always-zero

The current version of the DMA_ATTR_NON_CONSISTENT series is here:

http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/dma-noncoherent-allocator.2


> I want to test them for ARC.

Thanks a lot!

2018-12-14 12:28:24

by Eugeniy Paltsev

Subject: Re: [PATCH 1/2] dma-mapping: zero memory returned from dma_alloc_*

Hi Christoph,

do you have any public git repository with all your dma changes?

I want to test them for ARC.

Thanks.

On Fri, 2018-12-14 at 09:25 +0100, Christoph Hellwig wrote:
> If we want to map memory from the DMA allocator to userspace it must be
> zeroed at allocation time to prevent stale data leaks. We already do
> this on most common architectures, but some architectures don't do this
> yet, fix them up, either by passing GFP_ZERO when we use the normal page
> allocator or doing a manual memset otherwise.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> ---
--
Eugeniy Paltsev

2018-12-14 12:38:03

by Geert Uytterhoeven

Subject: Re: [PATCH 1/2] dma-mapping: zero memory returned from dma_alloc_*

Hi Christoph,

On Fri, Dec 14, 2018 at 12:47 PM Christoph Hellwig <[email protected]> wrote:
>
> On Fri, Dec 14, 2018 at 10:54:32AM +0100, Geert Uytterhoeven wrote:
> > > - page = alloc_pages(flag, order);
> > > + page = alloc_pages(flag | GFP_ZERO, order);
> > > if (!page)
> > > return NULL;
> >
> > There's a second implementation below, which calls __get_free_pages() and
> > does an explicit memset(). As __get_free_pages() calls alloc_pages(), perhaps
> > it makes sense to replace the memset() with GFP_ZERO, to increase consistency?
>
> It would, but this patch really tries to be minimally invasive to just
> provide the zeroing everywhere.

Fair enough.

> There is plenty of opportunity
> to improve the m68k dma allocator if I can get enough reviewers/testers:
>
> - for one the coldfire/nommu case absolutely does not make sense to
> me as there is no work done at all to make sure the memory is
> mapped uncached despite the architecture implementing cache
> flushing for the map interface. So this whole implementation
> looks broken to me and will need some major work (I had a previous
> discussion with Greg on that which needs to be dug out)
> - the "regular" implementation in this patch should probably be replaced
> with the generic remapping helpers that have been added for the 4.21
> merge window:
>
> http://git.infradead.org/users/hch/dma-mapping.git/commitdiff/0c3b3171ceccb8830c2bb5adff1b4e9b204c1450
>
> Compile tested only patch below:
>
> --
> From ade86dc75b9850daf9111ebf9ce15825a6144f2d Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig <[email protected]>
> Date: Fri, 14 Dec 2018 12:41:45 +0100
> Subject: m68k: use the generic dma coherent remap allocator
>
> This switches to using common code for the DMA allocations, including
> potential use of the CMA allocator if configured. Also add a few
> comments where the existing behavior seems to be lacking.
>
> Signed-off-by: Christoph Hellwig <[email protected]>

Thanks, looks OK to me.
M68k doesn't have many drivers using the DMA framework, as most of them
predated that framework.

> --- a/arch/m68k/kernel/dma.c
> +++ b/arch/m68k/kernel/dma.c

>
> -void arch_dma_free(struct device *dev, size_t size, void *addr,
> - dma_addr_t handle, unsigned long attrs)
> +pgprot_t arch_dma_mmap_pgprot(struct device *dev, pgprot_t prot,
> + unsigned long attrs)
> {
> - pr_debug("dma_free_coherent: %p, %x\n", addr, handle);
> - vfree(addr);
> + /*
> + * XXX: this doesn't seem to handle the sun3 MMU at all.

Sun-3 selects NO_DMA, and this file is compiled for the HAS_DMA case only.

> + */
> + if (CPU_IS_040_OR_060) {
> + pgprot_val(prot) &= ~_PAGE_CACHE040;
> + pgprot_val(prot) |= _PAGE_GLOBAL040 | _PAGE_NOCACHE_S;
> + } else {
> + pgprot_val(prot) |= _PAGE_NOCACHE030;
> + }
> + return prot;
> }

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2018-12-14 13:34:57

by Christoph Hellwig

Subject: Re: ensure dma_alloc_coherent always returns zeroed memory

And in various places this used GFP_ZERO instead of __GFP_ZERO,
so it won't compile.

The fixed version is available here:

http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/dma-alloc-always-zero


2018-12-14 14:27:41

by Greg Ungerer

Subject: Re: [PATCH 1/2] dma-mapping: zero memory returned from dma_alloc_*


On 14/12/18 9:47 pm, Christoph Hellwig wrote:
> On Fri, Dec 14, 2018 at 10:54:32AM +0100, Geert Uytterhoeven wrote:
>>> - page = alloc_pages(flag, order);
>>> + page = alloc_pages(flag | GFP_ZERO, order);
>>> if (!page)
>>> return NULL;
>>
>> There's a second implementation below, which calls __get_free_pages() and
>> does an explicit memset(). As __get_free_pages() calls alloc_pages(), perhaps
>> it makes sense to replace the memset() with GFP_ZERO, to increase consistency?
>
> It would, but this patch really tries to be minimally invasive to just
> provide the zeroing everywhere. There is plenty of opportunity
> to improve the m68k dma allocator if I can get enough reviewers/testers:
>
> - for one the coldfire/nommu case absolutely does not make sense to
> me as there is no work done at all to make sure the memory is
> mapped uncached despite the architecture implementing cache
> flushing for the map interface. So this whole implementation
> looks broken to me and will need some major work (I had a previous
> discussion with Greg on that which needs to be dug out)

Yep, that is right. Certainly the MMU case is broken. Some noMMU cases work
by virtue of the SoC only having an instruction cache (the older V2 cores).

The MMU case is fixable, but I think it will mean changing away from
the fall-back virtual:physical 1:1 mapping it uses for the kernel address
space. So not completely trivial. Either that or a dedicated area of RAM
for coherent allocations that we can mark as non-cacheable via the really
coarse-grained and limited ACR registers - not really very appealing.

The noMMU case in general is probably limited to something like that same
type of dedicated RAM/ACR register mechanism.

The most commonly used peripheral with DMA is the FEC ethernet module,
and it has some "special" (used very loosely) cache flushing for
parts like the 532x family which probably makes it mostly work right.
There is a PCI bus on the 54xx family of parts, and I know general
ethernet cards on it (like e1000's) have problems I am sure are
related to the fact that coherent memory allocations aren't.

I do plan to have a look at this for the MMU case some time soon.

Regards
Greg




> - the "regular" implementation in this patch should probably be replaced
> with the generic remapping helpers that have been added for the 4.21
> merge window:
>
> http://git.infradead.org/users/hch/dma-mapping.git/commitdiff/0c3b3171ceccb8830c2bb5adff1b4e9b204c1450
>
> Compile tested only patch below:
>
> --
> From ade86dc75b9850daf9111ebf9ce15825a6144f2d Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig <[email protected]>
> Date: Fri, 14 Dec 2018 12:41:45 +0100
> Subject: m68k: use the generic dma coherent remap allocator
>
> This switches to using common code for the DMA allocations, including
> potential use of the CMA allocator if configured. Also add a few
> comments where the existing behavior seems to be lacking.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> ---
> arch/m68k/Kconfig | 2 ++
> arch/m68k/kernel/dma.c | 64 ++++++++++++------------------------------
> 2 files changed, 20 insertions(+), 46 deletions(-)
>
> diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
> index 8a5868e9a3a0..60788cf02fbc 100644
> --- a/arch/m68k/Kconfig
> +++ b/arch/m68k/Kconfig
> @@ -2,10 +2,12 @@
> config M68K
> bool
> default y
> + select ARCH_HAS_DMA_MMAP_PGPROT if MMU && !COLDFIRE
> select ARCH_HAS_SYNC_DMA_FOR_DEVICE if HAS_DMA
> select ARCH_MIGHT_HAVE_PC_PARPORT if ISA
> select ARCH_NO_COHERENT_DMA_MMAP if !MMU
> select ARCH_NO_PREEMPT if !COLDFIRE
> + select DMA_DIRECT_REMAP if MMU && !COLDFIRE
> select HAVE_IDE
> select HAVE_AOUT if MMU
> select HAVE_DEBUG_BUGVERBOSE
> diff --git a/arch/m68k/kernel/dma.c b/arch/m68k/kernel/dma.c
> index dafe99d08a6a..16da5d96e228 100644
> --- a/arch/m68k/kernel/dma.c
> +++ b/arch/m68k/kernel/dma.c
> @@ -18,57 +18,29 @@
> #include <asm/pgalloc.h>
>
> #if defined(CONFIG_MMU) && !defined(CONFIG_COLDFIRE)
> -
> -void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
> - gfp_t flag, unsigned long attrs)
> +void arch_dma_prep_coherent(struct page *page, size_t size)
> {
> - struct page *page, **map;
> - pgprot_t pgprot;
> - void *addr;
> - int i, order;
> -
> - pr_debug("dma_alloc_coherent: %d,%x\n", size, flag);
> -
> - size = PAGE_ALIGN(size);
> - order = get_order(size);
> -
> - page = alloc_pages(flag | GFP_ZERO, order);
> - if (!page)
> - return NULL;
> -
> - *handle = page_to_phys(page);
> - map = kmalloc(sizeof(struct page *) << order, flag & ~__GFP_DMA);
> - if (!map) {
> - __free_pages(page, order);
> - return NULL;
> - }
> - split_page(page, order);
> -
> - order = 1 << order;
> - size >>= PAGE_SHIFT;
> - map[0] = page;
> - for (i = 1; i < size; i++)
> - map[i] = page + i;
> - for (; i < order; i++)
> - __free_page(page + i);
> - pgprot = __pgprot(_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_DIRTY);
> - if (CPU_IS_040_OR_060)
> - pgprot_val(pgprot) |= _PAGE_GLOBAL040 | _PAGE_NOCACHE_S;
> - else
> - pgprot_val(pgprot) |= _PAGE_NOCACHE030;
> - addr = vmap(map, size, VM_MAP, pgprot);
> - kfree(map);
> -
> - return addr;
> + /*
> + * XXX: don't we need to flush and invalidate the caches before
> + * creating a coherent mapping?
> + */
> }
>
> -void arch_dma_free(struct device *dev, size_t size, void *addr,
> - dma_addr_t handle, unsigned long attrs)
> +pgprot_t arch_dma_mmap_pgprot(struct device *dev, pgprot_t prot,
> + unsigned long attrs)
> {
> - pr_debug("dma_free_coherent: %p, %x\n", addr, handle);
> - vfree(addr);
> + /*
> + * XXX: this doesn't seem to handle the sun3 MMU at all.
> + */
> + if (CPU_IS_040_OR_060) {
> + pgprot_val(prot) &= ~_PAGE_CACHE040;
> + pgprot_val(prot) |= _PAGE_GLOBAL040 | _PAGE_NOCACHE_S;
> + } else {
> + pgprot_val(prot) |= _PAGE_NOCACHE030;
> + }
> + return prot;
> }
> -
> #else
>
> #include <asm/cacheflush.h>
>

2018-12-14 18:12:37

by Sam Ravnborg

Subject: Re: [PATCH 1/2] dma-mapping: zero memory returned from dma_alloc_*

Hi Christoph,

I stumbled upon this one:

#define __get_dma_pages(gfp_mask, order) \
__get_free_pages((gfp_mask) | GFP_DMA, (order))

(include/linux/gfp.h)
Should it also have the __GFP_ZERO treatment?
Or maybe this is already done in your tree?

As for the sparc bits:
Acked-by: Sam Ravnborg <[email protected]> [sparc]

Sam

2018-12-14 18:37:57

by Christoph Hellwig

Subject: Re: [PATCH 1/2] dma-mapping: zero memory returned from dma_alloc_*

On Fri, Dec 14, 2018 at 07:10:56PM +0100, Sam Ravnborg wrote:
> Hi Christoph,
>
> I stumbled upon this one:
>
> #define __get_dma_pages(gfp_mask, order) \
> __get_free_pages((gfp_mask) | GFP_DMA, (order))

This isn't directly related to the dma mapping, but another place
that hides GFP_DMA allocations. So no need for the treatment,
but we really should kill this obfuscating wrapper.
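
For illustration, a hypothetical open-coding of that wrapper (not from
the thread): the macro hides the GFP_DMA zone modifier, and spelling the
call out makes the zone choice visible at the call site.

#include <linux/gfp.h>

/*
 * Today, via the wrapper, GFP_DMA is invisible at the call site:
 *
 *      addr = __get_dma_pages(GFP_KERNEL, order);
 *
 * Hypothetical open-coded equivalent:
 */
static unsigned long get_dma_zone_pages(gfp_t gfp, unsigned int order)
{
        return __get_free_pages(gfp | GFP_DMA, order);
}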

2018-12-17 12:01:04

by Christoph Hellwig

Subject: Re: [PATCH 1/2] dma-mapping: zero memory returned from dma_alloc_*

On Sat, Dec 15, 2018 at 12:14:29AM +1000, Greg Ungerer wrote:
> Yep, that is right. Certainly the MMU case is broken. Some noMMU cases work
> by virtue of the SoC only having an instruction cache (the older V2 cores).

Is there a good and easy way to detect if a core has a cache? Either
at runtime or in Kconfig?

> The MMU case is fixable, but I think it will mean changing away from
> the fall-back virtual:physical 1:1 mapping it uses for the kernel address
> space. So not completely trivial. Either that or a dedicated area of RAM
> for coherent allocations that we can mark as non-cacheable via the really
> coarse-grained and limited ACR registers - not really very appealing.

What about CF_PAGE_NOCACHE? Reading arch/m68k/include/asm/mcf_pgtable.h
suggests this would cause an uncached mapping, in which case something
like this should work:

http://git.infradead.org/users/hch/misc.git/commitdiff/4b8711d436e8d56edbc5ca19aa2be639705bbfef
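
For concreteness, a sketch of what the commit linked above presumably
does (an assumption based on the mcf_pgtable.h reading, not verified
code; Greg explains later in the thread why this doesn't help kernel-mode
accesses, which the ACRs capture before the MMU is consulted):

#include <linux/dma-noncoherent.h>
#include <asm/mcf_pgtable.h>

/* Hypothetical: mark coherent DMA mappings uncached on ColdFire. */
pgprot_t arch_dma_mmap_pgprot(struct device *dev, pgprot_t prot,
                unsigned long attrs)
{
        pgprot_val(prot) |= CF_PAGE_NOCACHE;
        return prot;
}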

> The noMMU case in general is probably limited to something like that same
> type of dedicated RAM/ACR register mechanism.
>
> The most commonly used peripheral with DMA is the FEC ethernet module,
> and it has some "special" (used very loosely) cache flushing for
> parts like the 532x family which probably makes it mostly work right.
> There is a PCI bus on the 54xx family of parts, and I know general
> ethernet cards on it (like e1000's) have problems I am sure are
> related to the fact that coherent memory allocations aren't.

If we really just care about FEC we can just switch it to use
DMA_ATTR_NON_CONSISTENT and do explicit cache flushing. But as far
as I can tell FEC only uses DMA coherent allocations for the TSO
headers anyway; is TSO even used on this SoC?
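
A hedged sketch of that alternative, using the 4.20-era API
(DMA_ATTR_NON_CONSISTENT plus dma_cache_sync()); the helper name and the
conversion itself are assumptions about how such a driver change could
look, not code from this thread:

#include <linux/dma-mapping.h>

static void *fec_alloc_tso_hdrs(struct device *dev, size_t size,
                dma_addr_t *handle)
{
        /* May be cached; keeping it coherent is now the driver's job. */
        void *va = dma_alloc_attrs(dev, size, handle, GFP_KERNEL,
                        DMA_ATTR_NON_CONSISTENT);

        if (!va)
                return NULL;

        /* Explicitly flush before handing the buffer to the device. */
        dma_cache_sync(dev, va, size, DMA_TO_DEVICE);
        return va;
}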

2018-12-19 21:28:31

by Christoph Hellwig

[permalink] [raw]
Subject: Re: ensure dma_alloc_coherent always returns zeroed memory

FYI, I've picked this up for dma-mapping for-next now.

2018-12-20 16:17:17

by Eugeniy Paltsev

Subject: Re: ensure dma_alloc_coherent always returns zeroed memory

Hi Christoph,

I tested a kernel from your 'dma-alloc-always-zero' branch, and as
far as I can see we have DMA peripherals (like USB) broken.

Here is an example of the errors I got during USB initialization:
------------------------------>8--------------------------------
usb 1-1: device descriptor read/64, error -110
usb usb1-port1: attempt power cycle
usb 1-1: new high-speed USB device number 4 using ehci-platform
usb 1-1: device descriptor read/64, error -110
usb 1-1: device descriptor read/64, error -110
usb 1-1: new high-speed USB device number 5 using ehci-platform
usb 1-1: device descriptor read/64, error -110
usb 1-1: device descriptor read/64, error -110
usb usb1-port1: unable to enumerate USB device
usb 2-1: new full-speed USB device number 2 using ohci-platform
usb 2-1: device descriptor read/all, error -84
usb 2-1: new full-speed USB device number 3 using ohci-platform
usb 2-1: device descriptor read/all, error -84
usb usb2-port1: attempt power cycle
usb 2-1: new full-speed USB device number 4 using ohci-platform
usb 2-1: device descriptor read/8, error -62
usb 2-1: device descriptor read/8, error -62
usb 2-1: new full-speed USB device number 5 using ohci-platform
usb 2-1: device descriptor read/8, error -84
usb 2-1: device descriptor read/8, error -84
usb usb2-port1: unable to enumerate USB device
[snip]
------------------------------8<--------------------------------


On Wed, 2018-12-19 at 17:59 +0100, Christoph Hellwig wrote:
> FYI, I've picked this up for dma-mapping for-next now.
--
Eugeniy Paltsev

2018-12-20 16:18:11

by Christoph Hellwig

Subject: Re: ensure dma_alloc_coherent always returns zeroed memory

On Thu, Dec 20, 2018 at 02:32:52PM +0000, Eugeniy Paltsev wrote:
> Hi Christoph,
>
> I tested a kernel from your 'dma-alloc-always-zero' branch, and as
> far as I can see we have DMA peripherals (like USB) broken.

I would be really surprised if that is caused by the patch to add
the zeroing. Can you check which commit caused the issue by bisecting
from a known good baseline?

2018-12-20 16:20:43

by Eugeniy Paltsev

Subject: Re: ensure dma_alloc_coherent always returns zeroed memory

On Thu, 2018-12-20 at 15:34 +0100, [email protected] wrote:
> On Thu, Dec 20, 2018 at 02:32:52PM +0000, Eugeniy Paltsev wrote:
> > Hi Christoph,
> >
> > I tested a kernel from your 'dma-alloc-always-zero' branch, and as
> > far as I can see we have DMA peripherals (like USB) broken.
>
> I would be really surprised if that is caused by the patch to add
> the zeroing.
Me too :)

> Can you check which commit caused the issue by bisecting
> from a known good baseline?

Yep. At least a kernel built from v4.20-rc4 (which is the base for your
branch) seems to work fine.

--
Eugeniy Paltsev

2018-12-20 16:23:49

by Christoph Hellwig

Subject: Re: ensure dma_alloc_coherent always returns zeroed memory

On Thu, Dec 20, 2018 at 02:39:20PM +0000, Eugeniy Paltsev wrote:
> > I would be really surprised if that is caused by the patch to add
> > the zeroing.
> Me too :)
>
> > Can you check which commit caused the issue by bisecting
> > from a known good baseline?
>
> Yep. At least a kernel built from v4.20-rc4 (which is the base for your
> branch) seems to work fine.

Well, the branch has quite a few commits. Can you check the commit
just before the zeroing to start, and if that is already broken (which
I suspect) bisect the offending commit?

2018-12-21 01:19:54

by Christoph Hellwig

Subject: Re: ensure dma_alloc_coherent always returns zeroed memory

Btw, can you try with the very latest dma-mapping-for-next tree,
which has a new fix from Thierry Reding that might be related?

2019-01-11 08:23:59

by Greg Ungerer

Subject: Re: [PATCH 1/2] dma-mapping: zero memory returned from dma_alloc_*

Hi Christoph,

On 17/12/18 9:59 pm, Christoph Hellwig wrote:
> On Sat, Dec 15, 2018 at 12:14:29AM +1000, Greg Ungerer wrote:
>> Yep, that is right. Certainly the MMU case is broken. Some noMMU cases work
>> by virtue of the SoC only having an instruction cache (the older V2 cores).
>
> Is there a good and easy way to detect if a core has a cache? Either
> at runtime or in Kconfig?
>
>> The MMU case is fixable, but I think it will mean changing away from
>> the fall-back virtual:physical 1:1 mapping it uses for the kernel address
>> space. So not completely trivial. Either that or a dedicated area of RAM
>> for coherent allocations that we can mark as non-cacheable via the really
>> coarse-grained and limited ACR registers - not really very appealing.
>
> What about CF_PAGE_NOCACHE? Reading arch/m68k/include/asm/mcf_pgtable.h
> suggests this would cause an uncached mapping, in which case something
> like this should work:
>
> http://git.infradead.org/users/hch/misc.git/commitdiff/4b8711d436e8d56edbc5ca19aa2be639705bbfef

No, that won't work.

The current MMU setup for ColdFire relies on a quirk of the cache
control subsystem to map the kernel address space (actually all of RAM)
when accessed in supervisor mode.

The effective address calculation by the CPU/MMU first checks for a
RAMBAR access, then the ACRs, and only then falls through to the MMU.

From the ColdFire 5475 Reference Manual (section 5.5.1):

If virtual mode is enabled, any normal mode access that does not hit in the MMUBAR,
RAMBARs, ROMBARs, or ACRs is considered a normal mode virtual address request and
generates its access attributes from the MMU. For this case, the default CACR address attributes
are not used.

The MMUBAR maps the MMU control registers, the RAMBAR/ROMBAR the
internal static RAM/ROM regions, and the ACRs are the cache control
registers. The code in arch/m68k/coldfire/head.S sets up the ACR
registers so that all of RAM is accessible and cached when in supervisor
mode. So kernel code and data accesses will hit the ACRs and take their
attributes from them; user-page accesses won't match and will go through
to the MMU mappings.

The net result is we don't need page mappings or TLB entries
for kernel code/data. The problem is we also can't map individual
regions as uncached for coherent allocations... The ACR mapping
is all-or-nothing.

This leads back to what I mentioned earlier about changing the
VM mapping to not use the ACR mapping method and actually page
mapping the kernel space. Not completely trivial and I expect
there will be a performance hit with the extra TLB pressure and
the setup/remapping overhead.


>> The noMMU case in general is probably limited to something like that same
>> type of dedicated RAM/ACR register mechanism.
>>
>> The most commonly used peripheral with DMA is the FEC ethernet module,
>> and it has some "special" (used very loosely) cache flushing for
>> parts like the 532x family which probably makes it mostly work right.
>> There is a PCI bus on the 54xx family of parts, and I know general
>> ethernet cards on it (like e1000's) have problems I am sure are
>> related to the fact that coherent memory allocations aren't.
>
> If we really just care about FEC we can just switch it to use
> DMA_ATTR_NON_CONSISTENT and do explicit cache flushing. But as far
> as I can tell FEC only uses DMA coherent allocations for the TSO
> headers anyway; is TSO even used on this SoC?

The FEC is the most commonly used, but not the only one. I test generic PCI
NICs on the PCI bus of the ColdFire 5475, and a lot of those drivers
rely on coherent allocations.

Regards
Greg