2015-07-15 21:17:04

by Sean O. Stalley

Subject: [PATCH 0/4] mm: add dma_pool_zalloc() & pci_pool_zalloc()

Currently a call to dma_pool_alloc() with the __GFP_ZERO flag returns
a non-zeroed memory region.

This patchset adds support for the __GFP_ZERO flag to dma_pool_alloc(),
adds 2 wrapper functions for allocating zeroed memory from a pool,
and provides a coccinelle script for finding & replacing instances of
dma_pool_alloc() followed by memset(0) with a single dma_pool_zalloc() call.
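To illustrate what this changes on the driver side, here is a minimal userspace sketch of the two patterns. The pool API is stubbed out with malloc; the fake_* names are hypothetical stand-ins, not part of the kernel API:

```c
#include <stdlib.h>
#include <string.h>

enum { BLOCK_SIZE = 64 };

/*
 * Stand-in for dma_pool_alloc(): returns a "recycled" block full of
 * stale bytes, the way a pool can hand back memory that a previous
 * user released with dma_pool_free().
 */
static void *fake_pool_alloc(void)
{
	void *p = malloc(BLOCK_SIZE);

	if (p)
		memset(p, 0xa5, BLOCK_SIZE);	/* stale contents */
	return p;
}

/* What the proposed dma_pool_zalloc() wrapper boils down to. */
static void *fake_pool_zalloc(void)
{
	void *p = fake_pool_alloc();

	if (p)
		memset(p, 0, BLOCK_SIZE);
	return p;
}
```

A driver that today writes `buf = dma_pool_alloc(pool, GFP_KERNEL, &dma); memset(buf, 0, size);` collapses to a single `dma_pool_zalloc(pool, GFP_KERNEL, &dma)` call.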

Sean O. Stalley (4):
mm: Add support for __GFP_ZERO flag to dma_pool_alloc()
mm: Add dma_pool_zalloc() call to DMA API
pci: mm: Add pci_pool_zalloc() call
coccinelle: mm: scripts/coccinelle/api/alloc/pool_zalloc-simple.cocci

Documentation/DMA-API.txt | 7 ++
include/linux/dmapool.h | 6 ++
include/linux/pci.h | 2 +
mm/dmapool.c | 6 +-
.../coccinelle/api/alloc/pool_zalloc-simple.cocci | 84 ++++++++++++++++++++++
5 files changed, 104 insertions(+), 1 deletion(-)
create mode 100644 scripts/coccinelle/api/alloc/pool_zalloc-simple.cocci

--
1.9.1


2015-07-15 21:17:03

by Sean O. Stalley

Subject: [PATCH 1/4] mm: Add support for __GFP_ZERO flag to dma_pool_alloc()

Currently the __GFP_ZERO flag is ignored by dma_pool_alloc().
Make dma_pool_alloc() zero the memory if this flag is set.

Signed-off-by: Sean O. Stalley <[email protected]>
---
mm/dmapool.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/mm/dmapool.c b/mm/dmapool.c
index fd5fe43..449a5d09 100644
--- a/mm/dmapool.c
+++ b/mm/dmapool.c
@@ -334,7 +334,7 @@ void *dma_pool_alloc(struct dma_pool *pool, gfp_t mem_flags,
/* pool_alloc_page() might sleep, so temporarily drop &pool->lock */
spin_unlock_irqrestore(&pool->lock, flags);

- page = pool_alloc_page(pool, mem_flags);
+ page = pool_alloc_page(pool, mem_flags & (~__GFP_ZERO));
if (!page)
return NULL;

@@ -375,6 +375,10 @@ void *dma_pool_alloc(struct dma_pool *pool, gfp_t mem_flags,
memset(retval, POOL_POISON_ALLOCATED, pool->size);
#endif
spin_unlock_irqrestore(&pool->lock, flags);
+
+ if (mem_flags & __GFP_ZERO)
+ memset(retval, 0, pool->size);
+
return retval;
}
EXPORT_SYMBOL(dma_pool_alloc);
--
1.9.1

2015-07-15 21:18:06

by Sean O. Stalley

Subject: [PATCH 2/4] mm: Add dma_pool_zalloc() call to DMA API

Add a wrapper function for dma_pool_alloc() to get zeroed memory.

Signed-off-by: Sean O. Stalley <[email protected]>
---
Documentation/DMA-API.txt | 7 +++++++
include/linux/dmapool.h | 6 ++++++
2 files changed, 13 insertions(+)

diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt
index 5208840..988f757 100644
--- a/Documentation/DMA-API.txt
+++ b/Documentation/DMA-API.txt
@@ -104,6 +104,13 @@ crossing restrictions, pass 0 for alloc; passing 4096 says memory allocated
from this pool must not cross 4KByte boundaries.


+ void *dma_pool_zalloc(struct dma_pool *pool, gfp_t mem_flags,
+ dma_addr_t *handle)
+
+Wraps dma_pool_alloc() and also zeroes the returned memory if the
+allocation attempt succeeded.
+
+
void *dma_pool_alloc(struct dma_pool *pool, gfp_t gfp_flags,
dma_addr_t *dma_handle);

diff --git a/include/linux/dmapool.h b/include/linux/dmapool.h
index 022e34f..6d8079b 100644
--- a/include/linux/dmapool.h
+++ b/include/linux/dmapool.h
@@ -22,6 +22,12 @@ void dma_pool_destroy(struct dma_pool *pool);
void *dma_pool_alloc(struct dma_pool *pool, gfp_t mem_flags,
dma_addr_t *handle);

+static inline void *dma_pool_zalloc(struct dma_pool *pool, gfp_t mem_flags,
+ dma_addr_t *handle)
+{
+ return dma_pool_alloc(pool, mem_flags | __GFP_ZERO, handle);
+}
+
void dma_pool_free(struct dma_pool *pool, void *vaddr, dma_addr_t addr);

/*
--
1.9.1

2015-07-15 21:17:49

by Sean O. Stalley

Subject: [PATCH 3/4] pci: mm: Add pci_pool_zalloc() call

Add a wrapper function for pci_pool_alloc() to get zeroed memory.

Signed-off-by: Sean O. Stalley <[email protected]>
---
include/linux/pci.h | 2 ++
1 file changed, 2 insertions(+)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index 755a2cd..e6ec7d9 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1176,6 +1176,8 @@ int pci_set_vga_state(struct pci_dev *pdev, bool decode,
dma_pool_create(name, &pdev->dev, size, align, allocation)
#define pci_pool_destroy(pool) dma_pool_destroy(pool)
#define pci_pool_alloc(pool, flags, handle) dma_pool_alloc(pool, flags, handle)
+#define pci_pool_zalloc(pool, flags, handle) \
+ dma_pool_zalloc(pool, flags, handle)
#define pci_pool_free(pool, vaddr, addr) dma_pool_free(pool, vaddr, addr)

enum pci_dma_burst_strategy {
--
1.9.1

2015-07-15 21:17:05

by Sean O. Stalley

Subject: [PATCH 4/4] coccinelle: mm: scripts/coccinelle/api/alloc/pool_zalloc-simple.cocci

Add a [pci|dma]_pool_zalloc coccinelle check.
It replaces instances of [pci|dma]_pool_alloc() followed by memset(0)
with [pci|dma]_pool_zalloc().
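Assuming the script lands in the tree, it can be driven through the kernel's coccicheck target (spatch from the Coccinelle project must be installed); the subtree given here is only an example:

```shell
# Apply just this semantic patch to one subsystem; MODE=report only
# lists the call sites, MODE=patch emits the rewrites as a diff.
make coccicheck MODE=patch \
     COCCI=scripts/coccinelle/api/alloc/pool_zalloc-simple.cocci \
     M=drivers/usb/
```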

Signed-off-by: Sean O. Stalley <[email protected]>
---
.../coccinelle/api/alloc/pool_zalloc-simple.cocci | 84 ++++++++++++++++++++++
1 file changed, 84 insertions(+)
create mode 100644 scripts/coccinelle/api/alloc/pool_zalloc-simple.cocci

diff --git a/scripts/coccinelle/api/alloc/pool_zalloc-simple.cocci b/scripts/coccinelle/api/alloc/pool_zalloc-simple.cocci
new file mode 100644
index 0000000..9b7eb32
--- /dev/null
+++ b/scripts/coccinelle/api/alloc/pool_zalloc-simple.cocci
@@ -0,0 +1,84 @@
+///
+/// Use *_pool_zalloc rather than *_pool_alloc followed by memset with 0
+///
+// Copyright: (C) 2015 Intel Corp. GPLv2.
+// Options: --no-includes --include-headers
+//
+// Keywords: dma_pool_zalloc, pci_pool_zalloc
+//
+
+virtual context
+virtual patch
+virtual org
+virtual report
+
+//----------------------------------------------------------
+// For context mode
+//----------------------------------------------------------
+
+@depends on context@
+expression x;
+statement S;
+@@
+
+* x = \(dma_pool_alloc\|pci_pool_alloc\)(...);
+ if ((x==NULL) || ...) S
+* memset(x,0, ...);
+
+//----------------------------------------------------------
+// For patch mode
+//----------------------------------------------------------
+
+@depends on patch@
+expression x;
+expression a,b,c;
+statement S;
+@@
+
+- x = dma_pool_alloc(a,b,c);
++ x = dma_pool_zalloc(a,b,c);
+ if ((x==NULL) || ...) S
+- memset(x,0,...);
+
+@depends on patch@
+expression x;
+expression a,b,c;
+statement S;
+@@
+
+- x = pci_pool_alloc(a,b,c);
++ x = pci_pool_zalloc(a,b,c);
+ if ((x==NULL) || ...) S
+- memset(x,0,...);
+
+//----------------------------------------------------------
+// For org and report mode
+//----------------------------------------------------------
+
+@r depends on org || report@
+expression x;
+expression a,b,c;
+statement S;
+position p;
+@@
+
+ x = @p\(dma_pool_alloc\|pci_pool_alloc\)(a,b,c);
+ if ((x==NULL) || ...) S
+ memset(x,0, ...);
+
+@script:python depends on org@
+p << r.p;
+x << r.x;
+@@
+
+msg="%s" % (x)
+msg_safe=msg.replace("[","@(").replace("]",")")
+coccilib.org.print_todo(p[0], msg_safe)
+
+@script:python depends on report@
+p << r.p;
+x << r.x;
+@@
+
+msg="WARNING: *_pool_zalloc should be used for %s, instead of *_pool_alloc/memset" % (x)
+coccilib.report.print_report(p[0], msg)
--
1.9.1

2015-07-15 21:29:10

by Andrew Morton

Subject: Re: [PATCH 1/4] mm: Add support for __GFP_ZERO flag to dma_pool_alloc()

On Wed, 15 Jul 2015 14:14:40 -0700 "Sean O. Stalley" <[email protected]> wrote:

> Currently the __GFP_ZERO flag is ignored by dma_pool_alloc().
> Make dma_pool_alloc() zero the memory if this flag is set.
>
> ...
>
> --- a/mm/dmapool.c
> +++ b/mm/dmapool.c
> @@ -334,7 +334,7 @@ void *dma_pool_alloc(struct dma_pool *pool, gfp_t mem_flags,
> /* pool_alloc_page() might sleep, so temporarily drop &pool->lock */
> spin_unlock_irqrestore(&pool->lock, flags);
>
> - page = pool_alloc_page(pool, mem_flags);
> + page = pool_alloc_page(pool, mem_flags & (~__GFP_ZERO));
> if (!page)
> return NULL;
>
> @@ -375,6 +375,10 @@ void *dma_pool_alloc(struct dma_pool *pool, gfp_t mem_flags,
> memset(retval, POOL_POISON_ALLOCATED, pool->size);
> #endif
> spin_unlock_irqrestore(&pool->lock, flags);
> +
> + if (mem_flags & __GFP_ZERO)
> + memset(retval, 0, pool->size);
> +
> return retval;
> }
> EXPORT_SYMBOL(dma_pool_alloc);

hm, this code is all a bit confused.

We'd really prefer that the __GFP_ZERO be passed all the way to the
bottom level, so that places which are responsible for zeroing memory
(eg, the page allocator) can do their designated function. One reason
for this is that if someone comes up with a whizzy way of zeroing
memory on their architecture (eg, non-temporal store) then that will be
implemented in the core page allocator and the dma code will miss out.

Also, and just from a brief look around,
drivers/base/dma-coherent.c:dma_alloc_from_coherent() is already
zeroing the memory so under some circumstances I think we'll zero the
memory twice? We could fix that by passing the gfp_t to
dma_alloc_from_coherent() and then changing dma_alloc_from_coherent()
to *not* zero the memory if __GFP_ZERO, but wouldn't that be peculiar?

Also, passing __GFP_ZERO will now cause pool_alloc_page()'s
memset(POOL_POISON_FREED) to be wiped out. I guess that's harmless,
but a bit inefficient?

2015-07-15 23:15:57

by Sean O. Stalley

Subject: Re: [PATCH 1/4] mm: Add support for __GFP_ZERO flag to dma_pool_alloc()

Thanks for the review Andrew, my responses are inline.

-Sean

On Wed, Jul 15, 2015 at 02:29:07PM -0700, Andrew Morton wrote:
> On Wed, 15 Jul 2015 14:14:40 -0700 "Sean O. Stalley" <[email protected]> wrote:
>
> > Currently the __GFP_ZERO flag is ignored by dma_pool_alloc().
> > Make dma_pool_alloc() zero the memory if this flag is set.
> >
> > ...
> >
> > --- a/mm/dmapool.c
> > +++ b/mm/dmapool.c
> > @@ -334,7 +334,7 @@ void *dma_pool_alloc(struct dma_pool *pool, gfp_t mem_flags,
> > /* pool_alloc_page() might sleep, so temporarily drop &pool->lock */
> > spin_unlock_irqrestore(&pool->lock, flags);
> >
> > - page = pool_alloc_page(pool, mem_flags);
> > + page = pool_alloc_page(pool, mem_flags & (~__GFP_ZERO));
> > if (!page)
> > return NULL;
> >
> > @@ -375,6 +375,10 @@ void *dma_pool_alloc(struct dma_pool *pool, gfp_t mem_flags,
> > memset(retval, POOL_POISON_ALLOCATED, pool->size);
> > #endif
> > spin_unlock_irqrestore(&pool->lock, flags);
> > +
> > + if (mem_flags & __GFP_ZERO)
> > + memset(retval, 0, pool->size);
> > +
> > return retval;
> > }
> > EXPORT_SYMBOL(dma_pool_alloc);
>
> hm, this code is all a bit confused.
>
> We'd really prefer that the __GFP_ZERO be passed all the way to the
> bottom level, so that places which are responsible for zeroing memory
> (eg, the page allocator) can do their designated function. One reason
> for this is that if someone comes up with a whizzy way of zeroing
> memory on their architecture (eg, non-temporal store) then that will be
> implemented in the core page allocator and the dma code will miss out.

It would be nice if we could use the page allocator for whizzy zeroing.
There are a few reasons why I didn't pass __GFP_ZERO down to the allocator:

- dma_pool_alloc() reuses blocks of memory that were recently freed by dma_pool_free().
We have to memset(0) old blocks, since we don't know what's in them.

- When a new page is allocated, pool_initialise_page() writes an integer to every block.
So even if we passed __GFP_ZERO down to the allocator, the blocks would not be zeroed
by the time dma_pool_alloc() returns.

- Assuming a driver is allocating as often as it is freeing,
once the pool has enough memory it shouldn't call down to the allocator very often,
so any optimization down in the allocator shouldn't make much of a difference.
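The second point is worth making concrete. A rough userspace model of the page-initialisation step (illustrative only; the real code lives in mm/dmapool.c, and fake_initialise_page is a made-up name):

```c
#include <string.h>

/*
 * The pool's free list is threaded through the blocks themselves:
 * each free block begins with the offset of the next free block.
 * So even a page that arrived zeroed from the page allocator is
 * no longer zeroed once it has been initialised for the pool.
 */
static void fake_initialise_page(unsigned char *page, unsigned int page_size,
				 unsigned int block_size)
{
	unsigned int offset = 0;

	do {
		unsigned int next = offset + block_size;

		memcpy(page + offset, &next, sizeof(next));
		offset = next;
	} while (offset < page_size);
}
```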

> Also, and just from a brief look around,
> drivers/base/dma-coherent.c:dma_alloc_from_coherent() is already
> zeroing the memory so under some circumstances I think we'll zero the
> memory twice? We could fix that by passing the gfp_t to
> dma_alloc_from_coherent() and then changing dma_alloc_from_coherent()
> to *not* zero the memory if __GFP_ZERO, but wouldn't that be peculiar?

I noticed this as well. In this case, we would be zeroing twice.
This is no worse than the current case (where dma_pool_alloc() returns,
then the driver calls memset(0)).

> Also, passing __GFP_ZERO will now cause pool_alloc_page()'s
> memset(POOL_POISON_FREED) to be wiped out. I guess that's harmless,
> but a bit inefficient?

Inefficient, but no more inefficient than the current case.
I didn't think it would be a problem (since it only happens if dma pool debugging is enabled).
I could add a check to only memset the poison if __GFP_ZERO is not set.
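That check would amount to something like the following sketch. FAKE_GFP_ZERO, the poison value, and fill_block() are illustrative stand-ins here, not the kernel's names or values:

```c
#include <string.h>

#define POOL_POISON_ALLOCATED	0xa9	/* illustrative poison byte */
#define FAKE_GFP_ZERO		0x8000u	/* stand-in for __GFP_ZERO */

/*
 * Only write the debug poison when the caller did not ask for zeroed
 * memory, so the poison bytes aren't immediately wiped out by the
 * memset(0) that follows.
 */
static void fill_block(unsigned char *buf, size_t size, unsigned int mem_flags)
{
	if (mem_flags & FAKE_GFP_ZERO)
		memset(buf, 0, size);
	else
		memset(buf, POOL_POISON_ALLOCATED, size);
}
```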

2015-07-21 19:38:27

by Bjorn Helgaas

Subject: Re: [PATCH 3/4] pci: mm: Add pci_pool_zalloc() call

On Wed, Jul 15, 2015 at 02:14:42PM -0700, Sean O. Stalley wrote:
> Add a wrapper function for pci_pool_alloc() to get zeroed memory.
>
> Signed-off-by: Sean O. Stalley <[email protected]>

If you get details of managing __GFP_ZERO worked out, I'm fine with this
PCI part of it, and you can merge it along with the rest of the series:

Acked-by: Bjorn Helgaas <[email protected]>

Please capitalize "PCI" in the subject line, like this:

PCI: mm: Add pci_pool_zalloc() call

> ---
> include/linux/pci.h | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 755a2cd..e6ec7d9 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1176,6 +1176,8 @@ int pci_set_vga_state(struct pci_dev *pdev, bool decode,
> dma_pool_create(name, &pdev->dev, size, align, allocation)
> #define pci_pool_destroy(pool) dma_pool_destroy(pool)
> #define pci_pool_alloc(pool, flags, handle) dma_pool_alloc(pool, flags, handle)
> +#define pci_pool_zalloc(pool, flags, handle) \
> + dma_pool_zalloc(pool, flags, handle)
> #define pci_pool_free(pool, vaddr, addr) dma_pool_free(pool, vaddr, addr)
>
> enum pci_dma_burst_strategy {
> --
> 1.9.1
>