The first 21 patches in the set add support for the DMA attribute
DMA_ATTR_SKIP_CPU_SYNC on multiple platforms/architectures. This is needed
so that we can flag the calls to dma_map/unmap_page so that we do not
invalidate cache lines that do not currently belong to the device. Instead
we have to take care of this in the driver via a call to
sync_single_range_for_cpu prior to freeing the Rx page.
Patch 22 adds support for dma_map_page_attrs and dma_unmap_page_attrs so
that we can unmap and map a page using the DMA_ATTR_SKIP_CPU_SYNC
attribute.
Patch 23 adds support for freeing a page that has multiple references being
held by a single caller. This way we can free page fragments that were
allocated by a given driver.
The last 3 patches use these updates in the igb driver to allow for us to
reimplement the use of build_skb which hands a writable page off to the
stack.
My hope is to get the series accepted into the net-next tree as I have a
number of other Intel drivers I could then begin updating once these
patches are accepted.
Any feedback is welcome. Specifically if there is something I overlooked
design-wise or an architecture I missed please let me know and I will add
it to this patch set. If needed I can look into breaking this into a
smaller set of patches but this set is all that should be needed to then
start looking at putting together a DMA page pool per device which I know
is something Jesper has been working on.
---
Alexander Duyck (26):
swiotlb: Drop unused function swiotlb_map_sg
swiotlb: Add support for DMA_ATTR_SKIP_CPU_SYNC
arch/arc: Add option to skip sync on DMA mapping
arch/arm: Add option to skip sync on DMA map and unmap
arch/avr32: Add option to skip sync on DMA map
arch/blackfin: Add option to skip sync on DMA map
arch/c6x: Add option to skip sync on DMA map and unmap
arch/frv: Add option to skip sync on DMA map
arch/hexagon: Add option to skip DMA sync as a part of mapping
arch/m68k: Add option to skip DMA sync as a part of mapping
arch/metag: Add option to skip DMA sync as a part of map and unmap
arch/microblaze: Add option to skip DMA sync as a part of map and unmap
arch/mips: Add option to skip DMA sync as a part of map and unmap
arch/nios2: Add option to skip DMA sync as a part of map and unmap
arch/openrisc: Add option to skip DMA sync as a part of mapping
arch/parisc: Add option to skip DMA sync as a part of map and unmap
arch/powerpc: Add option to skip DMA sync as a part of mapping
arch/sh: Add option to skip DMA sync as a part of mapping
arch/sparc: Add option to skip DMA sync as a part of map and unmap
arch/tile: Add option to skip DMA sync as a part of map and unmap
arch/xtensa: Add option to skip DMA sync as a part of mapping
dma: Add calls for dma_map_page_attrs and dma_unmap_page_attrs
mm: Add support for releasing multiple instances of a page
igb: Update driver to make use of DMA_ATTR_SKIP_CPU_SYNC
igb: Update code to better handle incrementing page count
igb: Revert "igb: Revert support for build_skb in igb"
arch/arc/mm/dma.c | 3
arch/arm/common/dmabounce.c | 16 +-
arch/avr32/mm/dma-coherent.c | 7 +
arch/blackfin/kernel/dma-mapping.c | 7 +
arch/c6x/kernel/dma.c | 16 ++
arch/frv/mb93090-mb00/pci-dma-nommu.c | 16 ++
arch/frv/mb93090-mb00/pci-dma.c | 7 +
arch/hexagon/kernel/dma.c | 6 +
arch/m68k/kernel/dma.c | 8 +
arch/metag/kernel/dma.c | 16 ++
arch/microblaze/kernel/dma.c | 10 +
arch/mips/loongson64/common/dma-swiotlb.c | 2
arch/mips/mm/dma-default.c | 8 +
arch/nios2/mm/dma-mapping.c | 14 ++
arch/openrisc/kernel/dma.c | 3
arch/parisc/kernel/pci-dma.c | 20 ++-
arch/powerpc/kernel/dma.c | 9 +
arch/sh/kernel/dma-nommu.c | 7 +
arch/sparc/kernel/iommu.c | 4 -
arch/sparc/kernel/ioport.c | 4 -
arch/tile/kernel/pci-dma.c | 12 +-
arch/xtensa/kernel/pci-dma.c | 7 +
drivers/net/ethernet/intel/igb/igb.h | 36 ++++-
drivers/net/ethernet/intel/igb/igb_main.c | 207 +++++++++++++++++++++++------
drivers/xen/swiotlb-xen.c | 40 +++---
include/linux/dma-mapping.h | 20 ++-
include/linux/gfp.h | 2
include/linux/swiotlb.h | 10 +
lib/swiotlb.c | 56 ++++----
mm/page_alloc.c | 14 ++
30 files changed, 435 insertions(+), 152 deletions(-)
--
Signature
The use of DMA_ATTR_SKIP_CPU_SYNC was not consistent across all of the DMA
APIs in the arch/arm folder. This change is meant to correct that so that
we get consistent behavior.
Cc: Russell King <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
---
arch/arm/common/dmabounce.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/arch/arm/common/dmabounce.c b/arch/arm/common/dmabounce.c
index 3012816..75055df 100644
--- a/arch/arm/common/dmabounce.c
+++ b/arch/arm/common/dmabounce.c
@@ -243,7 +243,8 @@ static int needs_bounce(struct device *dev, dma_addr_t dma_addr, size_t size)
}
static inline dma_addr_t map_single(struct device *dev, void *ptr, size_t size,
- enum dma_data_direction dir)
+ enum dma_data_direction dir,
+ unsigned long attrs)
{
struct dmabounce_device_info *device_info = dev->archdata.dmabounce;
struct safe_buffer *buf;
@@ -262,7 +263,8 @@ static inline dma_addr_t map_single(struct device *dev, void *ptr, size_t size,
__func__, buf->ptr, virt_to_dma(dev, buf->ptr),
buf->safe, buf->safe_dma_addr);
- if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL) {
+ if ((dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL) &&
+ !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) {
dev_dbg(dev, "%s: copy unsafe %p to safe %p, size %d\n",
__func__, ptr, buf->safe, size);
memcpy(buf->safe, ptr, size);
@@ -272,7 +274,8 @@ static inline dma_addr_t map_single(struct device *dev, void *ptr, size_t size,
}
static inline void unmap_single(struct device *dev, struct safe_buffer *buf,
- size_t size, enum dma_data_direction dir)
+ size_t size, enum dma_data_direction dir,
+ unsigned long attrs)
{
BUG_ON(buf->size != size);
BUG_ON(buf->direction != dir);
@@ -283,7 +286,8 @@ static inline void unmap_single(struct device *dev, struct safe_buffer *buf,
DO_STATS(dev->archdata.dmabounce->bounce_count++);
- if (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL) {
+ if ((dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL) &&
+ !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) {
void *ptr = buf->ptr;
dev_dbg(dev, "%s: copy back safe %p to unsafe %p size %d\n",
@@ -334,7 +338,7 @@ static dma_addr_t dmabounce_map_page(struct device *dev, struct page *page,
return DMA_ERROR_CODE;
}
- return map_single(dev, page_address(page) + offset, size, dir);
+ return map_single(dev, page_address(page) + offset, size, dir, attrs);
}
/*
@@ -357,7 +361,7 @@ static void dmabounce_unmap_page(struct device *dev, dma_addr_t dma_addr, size_t
return;
}
- unmap_single(dev, buf, size, dir);
+ unmap_single(dev, buf, size, dir, attrs);
}
static int __dmabounce_sync_for_cpu(struct device *dev, dma_addr_t addr,
The use of DMA_ATTR_SKIP_CPU_SYNC was not consistent across all of the DMA
APIs in the arch/arm folder. This change is meant to correct that so that
we get consistent behavior.
Cc: Haavard Skinnemoen <[email protected]>
Cc: Hans-Christian Egtvedt <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
---
arch/avr32/mm/dma-coherent.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/avr32/mm/dma-coherent.c b/arch/avr32/mm/dma-coherent.c
index 58610d0..54534e5 100644
--- a/arch/avr32/mm/dma-coherent.c
+++ b/arch/avr32/mm/dma-coherent.c
@@ -146,7 +146,8 @@ static dma_addr_t avr32_dma_map_page(struct device *dev, struct page *page,
{
void *cpu_addr = page_address(page) + offset;
- dma_cache_sync(dev, cpu_addr, size, direction);
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ dma_cache_sync(dev, cpu_addr, size, direction);
return virt_to_bus(cpu_addr);
}
@@ -162,6 +163,10 @@ static int avr32_dma_map_sg(struct device *dev, struct scatterlist *sglist,
sg->dma_address = page_to_bus(sg_page(sg)) + sg->offset;
virt = sg_virt(sg);
+
+ if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+ continue;
+
dma_cache_sync(dev, virt, sg->length, direction);
}
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
later via a sync_for_cpu or sync_for_device call.
Cc: Mark Salter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: [email protected]
Signed-off-by: Alexander Duyck <[email protected]>
---
arch/c6x/kernel/dma.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/arch/c6x/kernel/dma.c b/arch/c6x/kernel/dma.c
index db4a6a3..d28df74 100644
--- a/arch/c6x/kernel/dma.c
+++ b/arch/c6x/kernel/dma.c
@@ -42,14 +42,17 @@ static dma_addr_t c6x_dma_map_page(struct device *dev, struct page *page,
{
dma_addr_t handle = virt_to_phys(page_address(page) + offset);
- c6x_dma_sync(handle, size, dir);
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ c6x_dma_sync(handle, size, dir);
+
return handle;
}
static void c6x_dma_unmap_page(struct device *dev, dma_addr_t handle,
size_t size, enum dma_data_direction dir, unsigned long attrs)
{
- c6x_dma_sync(handle, size, dir);
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ c6x_dma_sync(handle, size, dir);
}
static int c6x_dma_map_sg(struct device *dev, struct scatterlist *sglist,
@@ -60,7 +63,8 @@ static int c6x_dma_map_sg(struct device *dev, struct scatterlist *sglist,
for_each_sg(sglist, sg, nents, i) {
sg->dma_address = sg_phys(sg);
- c6x_dma_sync(sg->dma_address, sg->length, dir);
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ c6x_dma_sync(sg->dma_address, sg->length, dir);
}
return nents;
@@ -72,8 +76,10 @@ static void c6x_dma_unmap_sg(struct device *dev, struct scatterlist *sglist,
struct scatterlist *sg;
int i;
- for_each_sg(sglist, sg, nents, i)
- c6x_dma_sync(sg_dma_address(sg), sg->length, dir);
+ for_each_sg(sglist, sg, nents, i) {
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ c6x_dma_sync(sg_dma_address(sg), sg->length, dir);
+ }
}
The use of DMA_ATTR_SKIP_CPU_SYNC was not consistent across all of the DMA
APIs in the arch/arm folder. This change is meant to correct that so that
we get consistent behavior.
Cc: Steven Miao <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
---
arch/blackfin/kernel/dma-mapping.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/blackfin/kernel/dma-mapping.c b/arch/blackfin/kernel/dma-mapping.c
index 53fbbb6..ed9a6a8 100644
--- a/arch/blackfin/kernel/dma-mapping.c
+++ b/arch/blackfin/kernel/dma-mapping.c
@@ -133,6 +133,10 @@ static void bfin_dma_sync_sg_for_device(struct device *dev,
for_each_sg(sg_list, sg, nelems, i) {
sg->dma_address = (dma_addr_t) sg_virt(sg);
+
+ if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+ continue;
+
__dma_sync(sg_dma_address(sg), sg_dma_len(sg), direction);
}
}
@@ -143,7 +147,8 @@ static dma_addr_t bfin_dma_map_page(struct device *dev, struct page *page,
{
dma_addr_t handle = (dma_addr_t)(page_address(page) + offset);
- _dma_sync(handle, size, dir);
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ _dma_sync(handle, size, dir);
return handle;
}
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
later via a sync_for_cpu or sync_for_device call.
Cc: Geert Uytterhoeven <[email protected]>
Cc: [email protected]
Signed-off-by: Alexander Duyck <[email protected]>
---
arch/m68k/kernel/dma.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/m68k/kernel/dma.c b/arch/m68k/kernel/dma.c
index 8cf97cb..0707006 100644
--- a/arch/m68k/kernel/dma.c
+++ b/arch/m68k/kernel/dma.c
@@ -134,7 +134,9 @@ static dma_addr_t m68k_dma_map_page(struct device *dev, struct page *page,
{
dma_addr_t handle = page_to_phys(page) + offset;
- dma_sync_single_for_device(dev, handle, size, dir);
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ dma_sync_single_for_device(dev, handle, size, dir);
+
return handle;
}
@@ -146,6 +148,10 @@ static int m68k_dma_map_sg(struct device *dev, struct scatterlist *sglist,
for_each_sg(sglist, sg, nents, i) {
sg->dma_address = sg_phys(sg);
+
+ if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+ continue;
+
dma_sync_single_for_device(dev, sg->dma_address, sg->length,
dir);
}
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
via a sync_for_cpu or sync_for_device call.
Cc: James Hogan <[email protected]>
Cc: [email protected]
Signed-off-by: Alexander Duyck <[email protected]>
---
arch/metag/kernel/dma.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/arch/metag/kernel/dma.c b/arch/metag/kernel/dma.c
index 0db31e2..91968d9 100644
--- a/arch/metag/kernel/dma.c
+++ b/arch/metag/kernel/dma.c
@@ -484,8 +484,9 @@ static dma_addr_t metag_dma_map_page(struct device *dev, struct page *page,
unsigned long offset, size_t size,
enum dma_data_direction direction, unsigned long attrs)
{
- dma_sync_for_device((void *)(page_to_phys(page) + offset), size,
- direction);
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ dma_sync_for_device((void *)(page_to_phys(page) + offset),
+ size, direction);
return page_to_phys(page) + offset;
}
@@ -493,7 +494,8 @@ static void metag_dma_unmap_page(struct device *dev, dma_addr_t dma_address,
size_t size, enum dma_data_direction direction,
unsigned long attrs)
{
- dma_sync_for_cpu(phys_to_virt(dma_address), size, direction);
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ dma_sync_for_cpu(phys_to_virt(dma_address), size, direction);
}
static int metag_dma_map_sg(struct device *dev, struct scatterlist *sglist,
@@ -507,6 +509,10 @@ static int metag_dma_map_sg(struct device *dev, struct scatterlist *sglist,
BUG_ON(!sg_page(sg));
sg->dma_address = sg_phys(sg);
+
+ if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+ continue;
+
dma_sync_for_device(sg_virt(sg), sg->length, direction);
}
@@ -525,6 +531,10 @@ static void metag_dma_unmap_sg(struct device *dev, struct scatterlist *sglist,
BUG_ON(!sg_page(sg));
sg->dma_address = sg_phys(sg);
+
+ if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+ continue;
+
dma_sync_for_cpu(sg_virt(sg), sg->length, direction);
}
}
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
via a sync_for_cpu or sync_for_device call.
Cc: Ralf Baechle <[email protected]>
Cc: Keguang Zhang <[email protected]>
Cc: [email protected]
Signed-off-by: Alexander Duyck <[email protected]>
---
arch/mips/loongson64/common/dma-swiotlb.c | 2 +-
arch/mips/mm/dma-default.c | 8 +++++---
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/arch/mips/loongson64/common/dma-swiotlb.c b/arch/mips/loongson64/common/dma-swiotlb.c
index 1a80b6f..aab4fd6 100644
--- a/arch/mips/loongson64/common/dma-swiotlb.c
+++ b/arch/mips/loongson64/common/dma-swiotlb.c
@@ -61,7 +61,7 @@ static int loongson_dma_map_sg(struct device *dev, struct scatterlist *sg,
int nents, enum dma_data_direction dir,
unsigned long attrs)
{
- int r = swiotlb_map_sg_attrs(dev, sg, nents, dir, 0);
+ int r = swiotlb_map_sg_attrs(dev, sg, nents, dir, attrs);
mb();
return r;
diff --git a/arch/mips/mm/dma-default.c b/arch/mips/mm/dma-default.c
index b2eadd6..dd998d7 100644
--- a/arch/mips/mm/dma-default.c
+++ b/arch/mips/mm/dma-default.c
@@ -293,7 +293,7 @@ static inline void __dma_sync(struct page *page,
static void mips_dma_unmap_page(struct device *dev, dma_addr_t dma_addr,
size_t size, enum dma_data_direction direction, unsigned long attrs)
{
- if (cpu_needs_post_dma_flush(dev))
+ if (cpu_needs_post_dma_flush(dev) && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
__dma_sync(dma_addr_to_page(dev, dma_addr),
dma_addr & ~PAGE_MASK, size, direction);
plat_post_dma_flush(dev);
@@ -307,7 +307,8 @@ static int mips_dma_map_sg(struct device *dev, struct scatterlist *sglist,
struct scatterlist *sg;
for_each_sg(sglist, sg, nents, i) {
- if (!plat_device_is_coherent(dev))
+ if (!plat_device_is_coherent(dev) &&
+ !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
__dma_sync(sg_page(sg), sg->offset, sg->length,
direction);
#ifdef CONFIG_NEED_SG_DMA_LENGTH
@@ -324,7 +325,7 @@ static dma_addr_t mips_dma_map_page(struct device *dev, struct page *page,
unsigned long offset, size_t size, enum dma_data_direction direction,
unsigned long attrs)
{
- if (!plat_device_is_coherent(dev))
+ if (!plat_device_is_coherent(dev) && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
__dma_sync(page, offset, size, direction);
return plat_map_dma_mem_page(dev, page) + offset;
@@ -339,6 +340,7 @@ static void mips_dma_unmap_sg(struct device *dev, struct scatterlist *sglist,
for_each_sg(sglist, sg, nhwentries, i) {
if (!plat_device_is_coherent(dev) &&
+ !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
direction != DMA_TO_DEVICE)
__dma_sync(sg_page(sg), sg->offset, sg->length,
direction);
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
via a sync_for_cpu or sync_for_device call.
Cc: Michal Simek <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
---
arch/microblaze/kernel/dma.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/arch/microblaze/kernel/dma.c b/arch/microblaze/kernel/dma.c
index ec04dc1..818daf2 100644
--- a/arch/microblaze/kernel/dma.c
+++ b/arch/microblaze/kernel/dma.c
@@ -61,6 +61,10 @@ static int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl,
/* FIXME this part of code is untested */
for_each_sg(sgl, sg, nents, i) {
sg->dma_address = sg_phys(sg);
+
+ if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+ continue;
+
__dma_sync(page_to_phys(sg_page(sg)) + sg->offset,
sg->length, direction);
}
@@ -80,7 +84,8 @@ static inline dma_addr_t dma_direct_map_page(struct device *dev,
enum dma_data_direction direction,
unsigned long attrs)
{
- __dma_sync(page_to_phys(page) + offset, size, direction);
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ __dma_sync(page_to_phys(page) + offset, size, direction);
return page_to_phys(page) + offset;
}
@@ -95,7 +100,8 @@ static inline void dma_direct_unmap_page(struct device *dev,
* phys_to_virt is here because in __dma_sync_page is __virt_to_phys and
* dma_address is physical address
*/
- __dma_sync(dma_address, size, direction);
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ __dma_sync(dma_address, size, direction);
}
static inline void
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
via a sync_for_cpu or sync_for_device call.
Cc: Ley Foon Tan <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
---
arch/nios2/mm/dma-mapping.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/arch/nios2/mm/dma-mapping.c b/arch/nios2/mm/dma-mapping.c
index d800fad..b83e723 100644
--- a/arch/nios2/mm/dma-mapping.c
+++ b/arch/nios2/mm/dma-mapping.c
@@ -102,7 +102,9 @@ static int nios2_dma_map_sg(struct device *dev, struct scatterlist *sg,
addr = sg_virt(sg);
if (addr) {
- __dma_sync_for_device(addr, sg->length, direction);
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ __dma_sync_for_device(addr, sg->length,
+ direction);
sg->dma_address = sg_phys(sg);
}
}
@@ -117,7 +119,9 @@ static dma_addr_t nios2_dma_map_page(struct device *dev, struct page *page,
{
void *addr = page_address(page) + offset;
- __dma_sync_for_device(addr, size, direction);
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ __dma_sync_for_device(addr, size, direction);
+
return page_to_phys(page) + offset;
}
@@ -125,7 +129,8 @@ static void nios2_dma_unmap_page(struct device *dev, dma_addr_t dma_address,
size_t size, enum dma_data_direction direction,
unsigned long attrs)
{
- __dma_sync_for_cpu(phys_to_virt(dma_address), size, direction);
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ __dma_sync_for_cpu(phys_to_virt(dma_address), size, direction);
}
static void nios2_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
@@ -138,6 +143,9 @@ static void nios2_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
if (direction == DMA_TO_DEVICE)
return;
+ if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+ return;
+
for_each_sg(sg, sg, nhwentries, i) {
addr = sg_virt(sg);
if (addr)
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
via a sync_for_cpu or sync_for_device call.
Cc: "James E.J. Bottomley" <[email protected]>
Cc: Helge Deller <[email protected]>
Cc: [email protected]
Signed-off-by: Alexander Duyck <[email protected]>
---
arch/parisc/kernel/pci-dma.c | 20 +++++++++++++++-----
1 file changed, 15 insertions(+), 5 deletions(-)
diff --git a/arch/parisc/kernel/pci-dma.c b/arch/parisc/kernel/pci-dma.c
index 02d9ed0..be55ede 100644
--- a/arch/parisc/kernel/pci-dma.c
+++ b/arch/parisc/kernel/pci-dma.c
@@ -459,7 +459,9 @@ static dma_addr_t pa11_dma_map_page(struct device *dev, struct page *page,
void *addr = page_address(page) + offset;
BUG_ON(direction == DMA_NONE);
- flush_kernel_dcache_range((unsigned long) addr, size);
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ flush_kernel_dcache_range((unsigned long) addr, size);
+
return virt_to_phys(addr);
}
@@ -469,8 +471,11 @@ static void pa11_dma_unmap_page(struct device *dev, dma_addr_t dma_handle,
{
BUG_ON(direction == DMA_NONE);
+ if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+ return;
+
if (direction == DMA_TO_DEVICE)
- return;
+ return;
/*
* For PCI_DMA_FROMDEVICE this flush is not necessary for the
@@ -479,7 +484,6 @@ static void pa11_dma_unmap_page(struct device *dev, dma_addr_t dma_handle,
*/
flush_kernel_dcache_range((unsigned long) phys_to_virt(dma_handle), size);
- return;
}
static int pa11_dma_map_sg(struct device *dev, struct scatterlist *sglist,
@@ -496,6 +500,10 @@ static int pa11_dma_map_sg(struct device *dev, struct scatterlist *sglist,
sg_dma_address(sg) = (dma_addr_t) virt_to_phys(vaddr);
sg_dma_len(sg) = sg->length;
+
+ if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+ continue;
+
flush_kernel_dcache_range(vaddr, sg->length);
}
return nents;
@@ -510,14 +518,16 @@ static void pa11_dma_unmap_sg(struct device *dev, struct scatterlist *sglist,
BUG_ON(direction == DMA_NONE);
+ if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+ return;
+
if (direction == DMA_TO_DEVICE)
- return;
+ return;
/* once we do combining we'll need to use phys_to_virt(sg_dma_address(sglist)) */
for_each_sg(sglist, sg, nents, i)
flush_kernel_vmap_range(sg_virt(sg), sg->length);
- return;
}
static void pa11_dma_sync_single_for_cpu(struct device *dev,
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
via a sync_for_cpu or sync_for_device call.
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: [email protected]
Signed-off-by: Alexander Duyck <[email protected]>
---
arch/powerpc/kernel/dma.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
index e64a601..6877e3f 100644
--- a/arch/powerpc/kernel/dma.c
+++ b/arch/powerpc/kernel/dma.c
@@ -203,6 +203,10 @@ static int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl,
for_each_sg(sgl, sg, nents, i) {
sg->dma_address = sg_phys(sg) + get_dma_offset(dev);
sg->dma_length = sg->length;
+
+ if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+ continue;
+
__dma_sync_page(sg_page(sg), sg->offset, sg->length, direction);
}
@@ -235,7 +239,10 @@ static inline dma_addr_t dma_direct_map_page(struct device *dev,
unsigned long attrs)
{
BUG_ON(dir == DMA_NONE);
- __dma_sync_page(page, offset, size, dir);
+
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ __dma_sync_page(page, offset, size, dir);
+
return page_to_phys(page) + offset + get_dma_offset(dev);
}
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
via a sync_for_cpu or sync_for_device call.
Cc: "David S. Miller" <[email protected]>
Cc: [email protected]
Signed-off-by: Alexander Duyck <[email protected]>
---
arch/sparc/kernel/iommu.c | 4 ++--
arch/sparc/kernel/ioport.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/sparc/kernel/iommu.c b/arch/sparc/kernel/iommu.c
index 5c615ab..8fda4e4 100644
--- a/arch/sparc/kernel/iommu.c
+++ b/arch/sparc/kernel/iommu.c
@@ -415,7 +415,7 @@ static void dma_4u_unmap_page(struct device *dev, dma_addr_t bus_addr,
ctx = (iopte_val(*base) & IOPTE_CONTEXT) >> 47UL;
/* Step 1: Kick data out of streaming buffers if necessary. */
- if (strbuf->strbuf_enabled)
+ if (strbuf->strbuf_enabled && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
strbuf_flush(strbuf, iommu, bus_addr, ctx,
npages, direction);
@@ -640,7 +640,7 @@ static void dma_4u_unmap_sg(struct device *dev, struct scatterlist *sglist,
base = iommu->page_table + entry;
dma_handle &= IO_PAGE_MASK;
- if (strbuf->strbuf_enabled)
+ if (strbuf->strbuf_enabled && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
strbuf_flush(strbuf, iommu, dma_handle, ctx,
npages, direction);
diff --git a/arch/sparc/kernel/ioport.c b/arch/sparc/kernel/ioport.c
index 2344103..6ffaec4 100644
--- a/arch/sparc/kernel/ioport.c
+++ b/arch/sparc/kernel/ioport.c
@@ -527,7 +527,7 @@ static dma_addr_t pci32_map_page(struct device *dev, struct page *page,
static void pci32_unmap_page(struct device *dev, dma_addr_t ba, size_t size,
enum dma_data_direction dir, unsigned long attrs)
{
- if (dir != PCI_DMA_TODEVICE)
+ if (dir != PCI_DMA_TODEVICE && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
dma_make_coherent(ba, PAGE_ALIGN(size));
}
@@ -572,7 +572,7 @@ static void pci32_unmap_sg(struct device *dev, struct scatterlist *sgl,
struct scatterlist *sg;
int n;
- if (dir != PCI_DMA_TODEVICE) {
+ if (dir != PCI_DMA_TODEVICE && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) {
for_each_sg(sgl, sg, nents, n) {
dma_make_coherent(sg_phys(sg), PAGE_ALIGN(sg->length));
}
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
via a sync_for_cpu or sync_for_device call.
Cc: Chris Metcalf <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
---
arch/tile/kernel/pci-dma.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/arch/tile/kernel/pci-dma.c b/arch/tile/kernel/pci-dma.c
index 09bb774..24e0f8c 100644
--- a/arch/tile/kernel/pci-dma.c
+++ b/arch/tile/kernel/pci-dma.c
@@ -213,10 +213,12 @@ static int tile_dma_map_sg(struct device *dev, struct scatterlist *sglist,
for_each_sg(sglist, sg, nents, i) {
sg->dma_address = sg_phys(sg);
- __dma_prep_pa_range(sg->dma_address, sg->length, direction);
#ifdef CONFIG_NEED_SG_DMA_LENGTH
sg->dma_length = sg->length;
#endif
+ if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+ continue;
+ __dma_prep_pa_range(sg->dma_address, sg->length, direction);
}
return nents;
@@ -232,6 +234,8 @@ static void tile_dma_unmap_sg(struct device *dev, struct scatterlist *sglist,
BUG_ON(!valid_dma_direction(direction));
for_each_sg(sglist, sg, nents, i) {
sg->dma_address = sg_phys(sg);
+ if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+ continue;
__dma_complete_pa_range(sg->dma_address, sg->length,
direction);
}
@@ -245,7 +249,8 @@ static dma_addr_t tile_dma_map_page(struct device *dev, struct page *page,
BUG_ON(!valid_dma_direction(direction));
BUG_ON(offset + size > PAGE_SIZE);
- __dma_prep_page(page, offset, size, direction);
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ __dma_prep_page(page, offset, size, direction);
return page_to_pa(page) + offset;
}
@@ -256,6 +261,9 @@ static void tile_dma_unmap_page(struct device *dev, dma_addr_t dma_address,
{
BUG_ON(!valid_dma_direction(direction));
+ if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+ return;
+
__dma_complete_page(pfn_to_page(PFN_DOWN(dma_address)),
dma_address & (PAGE_SIZE - 1), size, direction);
}
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
via a sync_for_cpu or sync_for_device call.
Cc: Yoshinori Sato <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: [email protected]
Signed-off-by: Alexander Duyck <[email protected]>
---
arch/sh/kernel/dma-nommu.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/sh/kernel/dma-nommu.c b/arch/sh/kernel/dma-nommu.c
index eadb669..47fee3b 100644
--- a/arch/sh/kernel/dma-nommu.c
+++ b/arch/sh/kernel/dma-nommu.c
@@ -18,7 +18,9 @@ static dma_addr_t nommu_map_page(struct device *dev, struct page *page,
dma_addr_t addr = page_to_phys(page) + offset;
WARN_ON(size == 0);
- dma_cache_sync(dev, page_address(page) + offset, size, dir);
+
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ dma_cache_sync(dev, page_address(page) + offset, size, dir);
return addr;
}
@@ -35,7 +37,8 @@ static int nommu_map_sg(struct device *dev, struct scatterlist *sg,
for_each_sg(sg, s, nents, i) {
BUG_ON(!sg_page(s));
- dma_cache_sync(dev, sg_virt(s), s->length, dir);
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ dma_cache_sync(dev, sg_virt(s), s->length, dir);
s->dma_address = sg_phys(s);
s->dma_length = s->length;
Add support for mapping and unmapping a page with attributes. The primary
use for this is currently to allow for us to pass the
DMA_ATTR_SKIP_CPU_SYNC attribute when mapping and unmapping a page. On
some architectures such as ARM the synchronization has significant overhead
and if we are already taking care of the sync_for_cpu and sync_for_device
from the driver there isn't much need to handle this in the map/unmap calls
as well.
Signed-off-by: Alexander Duyck <[email protected]>
---
include/linux/dma-mapping.h | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 08528af..10c5a17 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -243,29 +243,33 @@ static inline void dma_unmap_sg_attrs(struct device *dev, struct scatterlist *sg
ops->unmap_sg(dev, sg, nents, dir, attrs);
}
-static inline dma_addr_t dma_map_page(struct device *dev, struct page *page,
- size_t offset, size_t size,
- enum dma_data_direction dir)
+static inline dma_addr_t dma_map_page_attrs(struct device *dev,
+ struct page *page,
+ size_t offset, size_t size,
+ enum dma_data_direction dir,
+ unsigned long attrs)
{
struct dma_map_ops *ops = get_dma_ops(dev);
dma_addr_t addr;
kmemcheck_mark_initialized(page_address(page) + offset, size);
BUG_ON(!valid_dma_direction(dir));
- addr = ops->map_page(dev, page, offset, size, dir, 0);
+ addr = ops->map_page(dev, page, offset, size, dir, attrs);
debug_dma_map_page(dev, page, offset, size, dir, addr, false);
return addr;
}
-static inline void dma_unmap_page(struct device *dev, dma_addr_t addr,
- size_t size, enum dma_data_direction dir)
+static inline void dma_unmap_page_attrs(struct device *dev,
+ dma_addr_t addr, size_t size,
+ enum dma_data_direction dir,
+ unsigned long attrs)
{
struct dma_map_ops *ops = get_dma_ops(dev);
BUG_ON(!valid_dma_direction(dir));
if (ops->unmap_page)
- ops->unmap_page(dev, addr, size, dir, 0);
+ ops->unmap_page(dev, addr, size, dir, attrs);
debug_dma_unmap_page(dev, addr, size, dir, false);
}
@@ -385,6 +389,8 @@ static inline void dma_sync_single_range_for_device(struct device *dev,
#define dma_unmap_single(d, a, s, r) dma_unmap_single_attrs(d, a, s, r, 0)
#define dma_map_sg(d, s, n, r) dma_map_sg_attrs(d, s, n, r, 0)
#define dma_unmap_sg(d, s, n, r) dma_unmap_sg_attrs(d, s, n, r, 0)
+#define dma_map_page(d, p, o, s, r) dma_map_page_attrs(d, p, o, s, r, 0)
+#define dma_unmap_page(d, a, s, r) dma_unmap_page_attrs(d, a, s, r, 0)
extern int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
void *cpu_addr, dma_addr_t dma_addr, size_t size);
This patch adds a function that allows us to batch free a page that has
multiple references outstanding. Specifically this function can be used to
drop a page being used in the page frag alloc cache. With this drivers can
make use of functionality similar to the page frag alloc cache without
having to do any workarounds for the fact that there is no function that
frees multiple references.
Cc: [email protected]
Signed-off-by: Alexander Duyck <[email protected]>
---
include/linux/gfp.h | 2 ++
mm/page_alloc.c | 14 ++++++++++++++
2 files changed, 16 insertions(+)
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index f8041f9de..4175dca 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -506,6 +506,8 @@ extern struct page *alloc_pages_vma(gfp_t gfp_mask, int order,
extern void free_hot_cold_page_list(struct list_head *list, bool cold);
struct page_frag_cache;
+extern void __page_frag_drain(struct page *page, unsigned int order,
+ unsigned int count);
extern void *__alloc_page_frag(struct page_frag_cache *nc,
unsigned int fragsz, gfp_t gfp_mask);
extern void __free_page_frag(void *addr);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ca423cc..253046a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3883,6 +3883,20 @@ static struct page *__page_frag_refill(struct page_frag_cache *nc,
return page;
}
+void __page_frag_drain(struct page *page, unsigned int order,
+ unsigned int count)
+{
+ VM_BUG_ON_PAGE(page_ref_count(page) == 0, page);
+
+ if (page_ref_sub_and_test(page, count)) {
+ if (order == 0)
+ free_hot_cold_page(page, false);
+ else
+ __free_pages_ok(page, order);
+ }
+}
+EXPORT_SYMBOL(__page_frag_drain);
+
void *__alloc_page_frag(struct page_frag_cache *nc,
unsigned int fragsz, gfp_t gfp_mask)
{
This reverts commit f9d40f6a9921 ("igb: Revert support for build_skb in
igb") and adds a few changes to update it to work with the latest version
of igb. We are now able to revert the removal of this due to the fact
that with the recent changes to the page count and the use of
DMA_ATTR_SKIP_CPU_SYNC we can make the pages writable so we should not be
invalidating the additional data added when we call build_skb.
The biggest risk with this change is that we are now not able to support
full jumbo frames when using build_skb. Instead we can only support up to
2K minus the skb overhead and padding offset.
Signed-off-by: Alexander Duyck <[email protected]>
---
drivers/net/ethernet/intel/igb/igb.h | 29 ++++++
drivers/net/ethernet/intel/igb/igb_main.c | 130 ++++++++++++++++++++++++++---
2 files changed, 142 insertions(+), 17 deletions(-)
diff --git a/drivers/net/ethernet/intel/igb/igb.h b/drivers/net/ethernet/intel/igb/igb.h
index acbc3ab..c3420f3 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -145,6 +145,10 @@ struct vf_data_storage {
#define IGB_RX_HDR_LEN IGB_RXBUFFER_256
#define IGB_RX_BUFSZ IGB_RXBUFFER_2048
+#define IGB_SKB_PAD (NET_SKB_PAD + NET_IP_ALIGN)
+#define IGB_MAX_BUILD_SKB_SIZE \
+ (SKB_WITH_OVERHEAD(IGB_RX_BUFSZ) - (IGB_SKB_PAD + IGB_TS_HDR_LEN))
+
/* How many Rx Buffers do we bundle into one write to the hardware ? */
#define IGB_RX_BUFFER_WRITE 16 /* Must be power of 2 */
@@ -301,12 +305,29 @@ struct igb_q_vector {
};
enum e1000_ring_flags_t {
- IGB_RING_FLAG_RX_SCTP_CSUM,
- IGB_RING_FLAG_RX_LB_VLAN_BSWAP,
- IGB_RING_FLAG_TX_CTX_IDX,
- IGB_RING_FLAG_TX_DETECT_HANG
+ IGB_RING_FLAG_RX_SCTP_CSUM = 0,
+#if (NET_IP_ALIGN != 0)
+ IGB_RING_FLAG_RX_BUILD_SKB_ENABLED = 1,
+#endif
+ IGB_RING_FLAG_RX_LB_VLAN_BSWAP = 2,
+ IGB_RING_FLAG_TX_CTX_IDX = 3,
+ IGB_RING_FLAG_TX_DETECT_HANG = 4,
+#if (NET_IP_ALIGN == 0)
+#if (L1_CACHE_SHIFT < 5)
+ IGB_RING_FLAG_RX_BUILD_SKB_ENABLED = 5,
+#else
+ IGB_RING_FLAG_RX_BUILD_SKB_ENABLED = L1_CACHE_SHIFT,
+#endif
+#endif
};
+#define ring_uses_build_skb(ring) \
+ test_bit(IGB_RING_FLAG_RX_BUILD_SKB_ENABLED, &(ring)->flags)
+#define set_ring_build_skb_enabled(ring) \
+ set_bit(IGB_RING_FLAG_RX_BUILD_SKB_ENABLED, &(ring)->flags)
+#define clear_ring_build_skb_enabled(ring) \
+ clear_bit(IGB_RING_FLAG_RX_BUILD_SKB_ENABLED, &(ring)->flags)
+
#define IGB_TXD_DCMD (E1000_ADVTXD_DCMD_EOP | E1000_ADVTXD_DCMD_RS)
#define IGB_RX_DESC(R, i) \
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 83fdef6..7674a50 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3761,6 +3761,16 @@ void igb_configure_rx_ring(struct igb_adapter *adapter,
wr32(E1000_RXDCTL(reg_idx), rxdctl);
}
+static void igb_set_rx_buffer_len(struct igb_adapter *adapter,
+ struct igb_ring *rx_ring)
+{
+ /* set build_skb flag */
+ if (adapter->max_frame_size <= IGB_MAX_BUILD_SKB_SIZE)
+ set_ring_build_skb_enabled(rx_ring);
+ else
+ clear_ring_build_skb_enabled(rx_ring);
+}
+
/**
* igb_configure_rx - Configure receive Unit after Reset
* @adapter: board private structure
@@ -3778,8 +3788,12 @@ static void igb_configure_rx(struct igb_adapter *adapter)
/* Setup the HW Rx Head and Tail Descriptor Pointers and
* the Base and Length of the Rx Descriptor Ring
*/
- for (i = 0; i < adapter->num_rx_queues; i++)
- igb_configure_rx_ring(adapter, adapter->rx_ring[i]);
+ for (i = 0; i < adapter->num_rx_queues; i++) {
+ struct igb_ring *rx_ring = adapter->rx_ring[i];
+
+ igb_set_rx_buffer_len(adapter, rx_ring);
+ igb_configure_rx_ring(adapter, rx_ring);
+ }
}
/**
@@ -4238,7 +4252,7 @@ static void igb_set_rx_mode(struct net_device *netdev)
struct igb_adapter *adapter = netdev_priv(netdev);
struct e1000_hw *hw = &adapter->hw;
unsigned int vfn = adapter->vfs_allocated_count;
- u32 rctl = 0, vmolr = 0;
+ u32 rctl = 0, vmolr = 0, rlpml = MAX_JUMBO_FRAME_SIZE;
int count;
/* Check for Promiscuous and All Multicast modes */
@@ -4310,12 +4324,18 @@ static void igb_set_rx_mode(struct net_device *netdev)
vmolr |= rd32(E1000_VMOLR(vfn)) &
~(E1000_VMOLR_ROPE | E1000_VMOLR_MPME | E1000_VMOLR_ROMPE);
- /* enable Rx jumbo frames, no need for restriction */
+ /* enable Rx jumbo frames, restrict as needed to support build_skb */
vmolr &= ~E1000_VMOLR_RLPML_MASK;
- vmolr |= MAX_JUMBO_FRAME_SIZE | E1000_VMOLR_LPE;
+ vmolr |= E1000_VMOLR_LPE;
+ vmolr |= (adapter->max_frame_size <= IGB_MAX_BUILD_SKB_SIZE) ?
+ IGB_MAX_BUILD_SKB_SIZE : MAX_JUMBO_FRAME_SIZE;
+
+ if (!adapter->vfs_allocated_count &&
+ (adapter->max_frame_size <= IGB_MAX_BUILD_SKB_SIZE))
+ rlpml = IGB_MAX_BUILD_SKB_SIZE;
wr32(E1000_VMOLR(vfn), vmolr);
- wr32(E1000_RLPML, MAX_JUMBO_FRAME_SIZE);
+ wr32(E1000_RLPML, rlpml);
igb_restore_vf_multicasts(adapter);
}
@@ -5046,9 +5066,9 @@ static void igb_tx_csum(struct igb_ring *tx_ring, struct igb_tx_buffer *first)
}
#define IGB_SET_FLAG(_input, _flag, _result) \
- ((_flag <= _result) ? \
- ((u32)(_input & _flag) * (_result / _flag)) : \
- ((u32)(_input & _flag) / (_flag / _result)))
+ (((_flag) <= (_result)) ? \
+ ((u32)(_input & (_flag)) * ((_result) / (_flag))) : \
+ ((u32)(_input & (_flag)) / ((_flag) / (_result))))
static u32 igb_tx_cmd_type(struct sk_buff *skb, u32 tx_flags)
{
@@ -6829,7 +6849,7 @@ static inline bool igb_page_is_reserved(struct page *page)
static bool igb_can_reuse_rx_page(struct igb_rx_buffer *rx_buffer,
struct page *page,
- unsigned int truesize)
+ const unsigned int truesize)
{
unsigned int pagecnt_bias = rx_buffer->pagecnt_bias--;
@@ -6888,7 +6908,7 @@ static bool igb_add_rx_frag(struct igb_ring *rx_ring,
struct page *page = rx_buffer->page;
unsigned char *va = page_address(page) + rx_buffer->page_offset;
#if (PAGE_SIZE < 8192)
- unsigned int truesize = IGB_RX_BUFSZ;
+ const unsigned int truesize = IGB_RX_BUFSZ;
#else
unsigned int truesize = SKB_DATA_ALIGN(size);
#endif
@@ -6933,6 +6953,78 @@ static bool igb_add_rx_frag(struct igb_ring *rx_ring,
return igb_can_reuse_rx_page(rx_buffer, page, truesize);
}
+static struct sk_buff *igb_build_rx_buffer(struct igb_ring *rx_ring,
+ union e1000_adv_rx_desc *rx_desc)
+{
+ unsigned int size = le16_to_cpu(rx_desc->wb.upper.length);
+ struct igb_rx_buffer *rx_buffer;
+ struct sk_buff *skb;
+ struct page *page;
+ void *va;
+#if (PAGE_SIZE < 8192)
+ const unsigned int truesize = IGB_RX_BUFSZ;
+#else
+ unsigned int truesize = SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) +
+ SKB_DATA_ALIGN(NET_SKB_PAD +
+ NET_IP_ALIGN +
+ size);
+#endif
+
+ rx_buffer = &rx_ring->rx_buffer_info[rx_ring->next_to_clean];
+ page = rx_buffer->page;
+ prefetchw(page);
+
+ /* we are reusing so sync this buffer for CPU use */
+ dma_sync_single_range_for_cpu(rx_ring->dev,
+ rx_buffer->dma,
+ rx_buffer->page_offset + IGB_SKB_PAD,
+ size,
+ DMA_FROM_DEVICE);
+
+ va = page_address(page) + rx_buffer->page_offset;
+
+ /* prefetch first cache line of first page */
+ prefetch(va + IGB_SKB_PAD);
+#if L1_CACHE_BYTES < 128
+ prefetch(va + L1_CACHE_BYTES + IGB_SKB_PAD);
+#endif
+
+ /* build an skb to around the page buffer */
+ skb = build_skb(va, truesize);
+ if (unlikely(!skb)) {
+ rx_ring->rx_stats.alloc_failed++;
+ return NULL;
+ }
+
+ /* update pointers within the skb to store the data */
+ skb_reserve(skb, IGB_SKB_PAD);
+ __skb_put(skb, size);
+
+ /* pull timestamp out of packet data */
+ if (igb_test_staterr(rx_desc, E1000_RXDADV_STAT_TSIP)) {
+ igb_ptp_rx_pktstamp(rx_ring->q_vector, skb->data, skb);
+ __skb_pull(skb, IGB_TS_HDR_LEN);
+ }
+
+ if (igb_can_reuse_rx_page(rx_buffer, page, truesize)) {
+ /* hand second half of page back to the ring */
+ igb_reuse_rx_page(rx_ring, rx_buffer);
+ } else {
+ /* We are not reusing the buffer so unmap it and free
+ * any references we are holding to it
+ */
+ dma_unmap_page_attrs(rx_ring->dev, rx_buffer->dma,
+ PAGE_SIZE, DMA_FROM_DEVICE,
+ DMA_ATTR_SKIP_CPU_SYNC);
+ __page_frag_drain(page, 0, rx_buffer->pagecnt_bias);
+ }
+
+ /* clear contents of rx_buffer */
+ rx_buffer->page = NULL;
+
+ return skb;
+}
+
static struct sk_buff *igb_fetch_rx_buffer(struct igb_ring *rx_ring,
union e1000_adv_rx_desc *rx_desc,
struct sk_buff *skb)
@@ -7178,7 +7270,10 @@ static int igb_clean_rx_irq(struct igb_q_vector *q_vector, const int budget)
dma_rmb();
/* retrieve a buffer from the ring */
- skb = igb_fetch_rx_buffer(rx_ring, rx_desc, skb);
+ if (ring_uses_build_skb(rx_ring))
+ skb = igb_build_rx_buffer(rx_ring, rx_desc);
+ else
+ skb = igb_fetch_rx_buffer(rx_ring, rx_desc, skb);
/* exit if we failed to retrieve a buffer */
if (!skb)
@@ -7266,6 +7361,13 @@ static bool igb_alloc_mapped_page(struct igb_ring *rx_ring,
return true;
}
+static inline unsigned int igb_rx_offset(struct igb_ring *rx_ring)
+{
+ return IGB_SET_FLAG(rx_ring->flags,
+ 1 << IGB_RING_FLAG_RX_BUILD_SKB_ENABLED,
+ IGB_SKB_PAD);
+}
+
/**
* igb_alloc_rx_buffers - Replace used receive buffers; packet split
* @adapter: address of board private structure
@@ -7297,7 +7399,9 @@ void igb_alloc_rx_buffers(struct igb_ring *rx_ring, u16 cleaned_count)
/* Refresh the desc even if buffer_addrs didn't change
* because each write-back erases this info.
*/
- rx_desc->read.pkt_addr = cpu_to_le64(bi->dma + bi->page_offset);
+ rx_desc->read.pkt_addr = cpu_to_le64(bi->dma +
+ bi->page_offset +
+ igb_rx_offset(rx_ring));
rx_desc++;
bi++;
The ARM architecture provides a mechanism for deferring cache line
invalidation in the case of map/unmap. This patch makes use of this
mechanism to avoid unnecessary synchronization.
A secondary effect of this change is that the portion of the page that has
been synchronized for use by the CPU should be writable and could be passed
up the stack (at least on ARM).
The last bit that occurred to me is that on architectures where the
sync_for_cpu call invalidates cache lines we were prefetching and then
invalidating the first 128 bytes of the packet. To avoid that I have moved
the sync up to before we perform the prefetch and allocate the skbuff so
that we can actually make use of it.
Signed-off-by: Alexander Duyck <[email protected]>
---
drivers/net/ethernet/intel/igb/igb_main.c | 53 ++++++++++++++++++-----------
1 file changed, 33 insertions(+), 20 deletions(-)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 4feca69..c8c458c 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3947,10 +3947,21 @@ static void igb_clean_rx_ring(struct igb_ring *rx_ring)
if (!buffer_info->page)
continue;
- dma_unmap_page(rx_ring->dev,
- buffer_info->dma,
- PAGE_SIZE,
- DMA_FROM_DEVICE);
+ /* Invalidate cache lines that may have been written to by
+ * device so that we avoid corrupting memory.
+ */
+ dma_sync_single_range_for_cpu(rx_ring->dev,
+ buffer_info->dma,
+ buffer_info->page_offset,
+ IGB_RX_BUFSZ,
+ DMA_FROM_DEVICE);
+
+ /* free resources associated with mapping */
+ dma_unmap_page_attrs(rx_ring->dev,
+ buffer_info->dma,
+ PAGE_SIZE,
+ DMA_FROM_DEVICE,
+ DMA_ATTR_SKIP_CPU_SYNC);
__free_page(buffer_info->page);
buffer_info->page = NULL;
@@ -6808,12 +6819,6 @@ static void igb_reuse_rx_page(struct igb_ring *rx_ring,
/* transfer page from old buffer to new buffer */
*new_buff = *old_buff;
-
- /* sync the buffer for use by the device */
- dma_sync_single_range_for_device(rx_ring->dev, old_buff->dma,
- old_buff->page_offset,
- IGB_RX_BUFSZ,
- DMA_FROM_DEVICE);
}
static inline bool igb_page_is_reserved(struct page *page)
@@ -6934,6 +6939,13 @@ static struct sk_buff *igb_fetch_rx_buffer(struct igb_ring *rx_ring,
page = rx_buffer->page;
prefetchw(page);
+ /* we are reusing so sync this buffer for CPU use */
+ dma_sync_single_range_for_cpu(rx_ring->dev,
+ rx_buffer->dma,
+ rx_buffer->page_offset,
+ size,
+ DMA_FROM_DEVICE);
+
if (likely(!skb)) {
void *page_addr = page_address(page) +
rx_buffer->page_offset;
@@ -6958,21 +6970,15 @@ static struct sk_buff *igb_fetch_rx_buffer(struct igb_ring *rx_ring,
prefetchw(skb->data);
}
- /* we are reusing so sync this buffer for CPU use */
- dma_sync_single_range_for_cpu(rx_ring->dev,
- rx_buffer->dma,
- rx_buffer->page_offset,
- size,
- DMA_FROM_DEVICE);
-
/* pull page into skb */
if (igb_add_rx_frag(rx_ring, rx_buffer, size, rx_desc, skb)) {
/* hand second half of page back to the ring */
igb_reuse_rx_page(rx_ring, rx_buffer);
} else {
/* we are not reusing the buffer so unmap it */
- dma_unmap_page(rx_ring->dev, rx_buffer->dma,
- PAGE_SIZE, DMA_FROM_DEVICE);
+ dma_unmap_page_attrs(rx_ring->dev, rx_buffer->dma,
+ PAGE_SIZE, DMA_FROM_DEVICE,
+ DMA_ATTR_SKIP_CPU_SYNC);
}
/* clear contents of rx_buffer */
@@ -7230,7 +7236,8 @@ static bool igb_alloc_mapped_page(struct igb_ring *rx_ring,
}
/* map page for use */
- dma = dma_map_page(rx_ring->dev, page, 0, PAGE_SIZE, DMA_FROM_DEVICE);
+ dma = dma_map_page_attrs(rx_ring->dev, page, 0, PAGE_SIZE,
+ DMA_FROM_DEVICE, DMA_ATTR_SKIP_CPU_SYNC);
/* if mapping failed free memory back to system since
* there isn't much point in holding memory we can't use
@@ -7271,6 +7278,12 @@ void igb_alloc_rx_buffers(struct igb_ring *rx_ring, u16 cleaned_count)
if (!igb_alloc_mapped_page(rx_ring, bi))
break;
+ /* sync the buffer for use by the device */
+ dma_sync_single_range_for_device(rx_ring->dev, bi->dma,
+ bi->page_offset,
+ IGB_RX_BUFSZ,
+ DMA_FROM_DEVICE);
+
/* Refresh the desc even if buffer_addrs didn't change
* because each write-back erases this info.
*/
This patch updates the driver code so that we do bulk updates of the page
reference count instead of just incrementing it by one reference at a time.
The advantage to doing this is that we cut down on atomic operations and
this in turn should give us a slight improvement in cycles per packet. In
addition if we eventually move this over to using build_skb the gains will
be more noticeable.
Signed-off-by: Alexander Duyck <[email protected]>
---
drivers/net/ethernet/intel/igb/igb.h | 7 ++++++-
drivers/net/ethernet/intel/igb/igb_main.c | 24 +++++++++++++++++-------
2 files changed, 23 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/intel/igb/igb.h b/drivers/net/ethernet/intel/igb/igb.h
index d11093d..acbc3ab 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -210,7 +210,12 @@ struct igb_tx_buffer {
struct igb_rx_buffer {
dma_addr_t dma;
struct page *page;
- unsigned int page_offset;
+#if (BITS_PER_LONG > 32) || (PAGE_SIZE >= 65536)
+ __u32 page_offset;
+#else
+ __u16 page_offset;
+#endif
+ __u16 pagecnt_bias;
};
struct igb_tx_queue_stats {
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index c8c458c..83fdef6 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3962,7 +3962,8 @@ static void igb_clean_rx_ring(struct igb_ring *rx_ring)
PAGE_SIZE,
DMA_FROM_DEVICE,
DMA_ATTR_SKIP_CPU_SYNC);
- __free_page(buffer_info->page);
+ __page_frag_drain(buffer_info->page, 0,
+ buffer_info->pagecnt_bias);
buffer_info->page = NULL;
}
@@ -6830,13 +6831,15 @@ static bool igb_can_reuse_rx_page(struct igb_rx_buffer *rx_buffer,
struct page *page,
unsigned int truesize)
{
+ unsigned int pagecnt_bias = rx_buffer->pagecnt_bias--;
+
/* avoid re-using remote pages */
if (unlikely(igb_page_is_reserved(page)))
return false;
#if (PAGE_SIZE < 8192)
/* if we are only owner of page we can reuse it */
- if (unlikely(page_count(page) != 1))
+ if (unlikely(page_ref_count(page) != pagecnt_bias))
return false;
/* flip page offset to other buffer */
@@ -6849,10 +6852,14 @@ static bool igb_can_reuse_rx_page(struct igb_rx_buffer *rx_buffer,
return false;
#endif
- /* Even if we own the page, we are not allowed to use atomic_set()
- * This would break get_page_unless_zero() users.
+ /* If we have drained the page fragment pool we need to update
+ * the pagecnt_bias and page count so that we fully restock the
+ * number of references the driver holds.
*/
- page_ref_inc(page);
+ if (unlikely(!rx_buffer->pagecnt_bias)) {
+ page_ref_add(page, USHRT_MAX);
+ rx_buffer->pagecnt_bias = USHRT_MAX;
+ }
return true;
}
@@ -6904,7 +6911,6 @@ static bool igb_add_rx_frag(struct igb_ring *rx_ring,
return true;
/* this page cannot be reused so discard it */
- __free_page(page);
return false;
}
@@ -6975,10 +6981,13 @@ static struct sk_buff *igb_fetch_rx_buffer(struct igb_ring *rx_ring,
/* hand second half of page back to the ring */
igb_reuse_rx_page(rx_ring, rx_buffer);
} else {
- /* we are not reusing the buffer so unmap it */
+ /* We are not reusing the buffer so unmap it and free
+ * any references we are holding to it
+ */
dma_unmap_page_attrs(rx_ring->dev, rx_buffer->dma,
PAGE_SIZE, DMA_FROM_DEVICE,
DMA_ATTR_SKIP_CPU_SYNC);
+ __page_frag_drain(page, 0, rx_buffer->pagecnt_bias);
}
/* clear contents of rx_buffer */
@@ -7252,6 +7261,7 @@ static bool igb_alloc_mapped_page(struct igb_ring *rx_ring,
bi->dma = dma;
bi->page = page;
bi->page_offset = 0;
+ bi->pagecnt_bias = 1;
return true;
}
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
via a sync_for_cpu or sync_for_device call.
Cc: Max Filippov <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
---
arch/xtensa/kernel/pci-dma.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/xtensa/kernel/pci-dma.c b/arch/xtensa/kernel/pci-dma.c
index 1e68806..6a16dec 100644
--- a/arch/xtensa/kernel/pci-dma.c
+++ b/arch/xtensa/kernel/pci-dma.c
@@ -189,7 +189,9 @@ static dma_addr_t xtensa_map_page(struct device *dev, struct page *page,
{
dma_addr_t dma_handle = page_to_phys(page) + offset;
- xtensa_sync_single_for_device(dev, dma_handle, size, dir);
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ xtensa_sync_single_for_device(dev, dma_handle, size, dir);
+
return dma_handle;
}
@@ -197,7 +199,8 @@ static void xtensa_unmap_page(struct device *dev, dma_addr_t dma_handle,
size_t size, enum dma_data_direction dir,
unsigned long attrs)
{
- xtensa_sync_single_for_cpu(dev, dma_handle, size, dir);
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ xtensa_sync_single_for_cpu(dev, dma_handle, size, dir);
}
static int xtensa_map_sg(struct device *dev, struct scatterlist *sg,
On Mon, Oct 24, 2016 at 08:04:37AM -0400, Alexander Duyck wrote:
> As a first step to making DMA_ATTR_SKIP_CPU_SYNC apply to architectures
> beyond just ARM I need to make it so that the swiotlb will respect the
> flag. In order to do that I also need to update the swiotlb-xen since it
> heavily makes use of the functionality.
>
> Cc: Konrad Rzeszutek Wilk <[email protected]>
> Signed-off-by: Alexander Duyck <[email protected]>
> ---
> drivers/xen/swiotlb-xen.c | 40 ++++++++++++++++++++++----------------
> include/linux/swiotlb.h | 6 ++++--
> lib/swiotlb.c | 48 +++++++++++++++++++++++++++------------------
> 3 files changed, 56 insertions(+), 38 deletions(-)
>
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 87e6035..cf047d8 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -405,7 +405,8 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
> */
> trace_swiotlb_bounced(dev, dev_addr, size, swiotlb_force);
>
> - map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, size, dir);
> + map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, size, dir,
> + attrs);
> if (map == SWIOTLB_MAP_ERROR)
> return DMA_ERROR_CODE;
>
> @@ -416,11 +417,13 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
> /*
> * Ensure that the address returned is DMA'ble
> */
> - if (!dma_capable(dev, dev_addr, size)) {
> - swiotlb_tbl_unmap_single(dev, map, size, dir);
> - dev_addr = 0;
> - }
> - return dev_addr;
> + if (dma_capable(dev, dev_addr, size))
> + return dev_addr;
> +
> + swiotlb_tbl_unmap_single(dev, map, size, dir,
> + attrs | DMA_ATTR_SKIP_CPU_SYNC);
> +
> + return DMA_ERROR_CODE;
Why? This change (re-ordering the code - and returning DMA_ERROR_CODE instead
of 0) does not have anything to do with the title.
If you really feel strongly about it - then please send it as a seperate patch.
> }
> EXPORT_SYMBOL_GPL(xen_swiotlb_map_page);
>
> @@ -444,7 +447,7 @@ static void xen_unmap_single(struct device *hwdev, dma_addr_t dev_addr,
>
> /* NOTE: We use dev_addr here, not paddr! */
> if (is_xen_swiotlb_buffer(dev_addr)) {
> - swiotlb_tbl_unmap_single(hwdev, paddr, size, dir);
> + swiotlb_tbl_unmap_single(hwdev, paddr, size, dir, attrs);
> return;
> }
>
> @@ -557,16 +560,9 @@ void xen_swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
> start_dma_addr,
> sg_phys(sg),
> sg->length,
> - dir);
> - if (map == SWIOTLB_MAP_ERROR) {
> - dev_warn(hwdev, "swiotlb buffer is full\n");
> - /* Don't panic here, we expect map_sg users
> - to do proper error handling. */
> - xen_swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir,
> - attrs);
> - sg_dma_len(sgl) = 0;
> - return 0;
> - }
> + dir, attrs);
> + if (map == SWIOTLB_MAP_ERROR)
> + goto map_error;
> xen_dma_map_page(hwdev, pfn_to_page(map >> PAGE_SHIFT),
> dev_addr,
> map & ~PAGE_MASK,
> @@ -589,6 +585,16 @@ void xen_swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
> sg_dma_len(sg) = sg->length;
> }
> return nelems;
> +map_error:
> + dev_warn(hwdev, "swiotlb buffer is full\n");
> + /*
> + * Don't panic here, we expect map_sg users
> + * to do proper error handling.
> + */
> + xen_swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir,
> + attrs | DMA_ATTR_SKIP_CPU_SYNC);
> + sg_dma_len(sgl) = 0;
> + return 0;
> }
This too. Why can't that be part of the existing code that was there?
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
via a sync_for_cpu or sync_for_device call.
Cc: Jonas Bonn <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
---
arch/openrisc/kernel/dma.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/openrisc/kernel/dma.c b/arch/openrisc/kernel/dma.c
index 140c991..906998b 100644
--- a/arch/openrisc/kernel/dma.c
+++ b/arch/openrisc/kernel/dma.c
@@ -141,6 +141,9 @@
unsigned long cl;
dma_addr_t addr = page_to_phys(page) + offset;
+ if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+ return addr;
+
switch (dir) {
case DMA_TO_DEVICE:
/* Flush the dcache for the requested range */
On Mon, Oct 24, 2016 at 08:04:31AM -0400, Alexander Duyck wrote:
> There are no users for swiotlb_map_sg so we might as well just drop it.
>
> Cc: Konrad Rzeszutek Wilk <[email protected]>
Acked-by: Konrad Rzeszutek Wilk <[email protected]>
Thought I swear I saw a familiar patch by Christopher Hellwig at some point..
but maybe that patchset had been dropped.
> Signed-off-by: Alexander Duyck <[email protected]>
> ---
> include/linux/swiotlb.h | 4 ----
> lib/swiotlb.c | 8 --------
> 2 files changed, 12 deletions(-)
>
> diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> index 5f81f8a..e237b6f 100644
> --- a/include/linux/swiotlb.h
> +++ b/include/linux/swiotlb.h
> @@ -72,10 +72,6 @@ extern void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
> size_t size, enum dma_data_direction dir,
> unsigned long attrs);
>
> -extern int
> -swiotlb_map_sg(struct device *hwdev, struct scatterlist *sg, int nents,
> - enum dma_data_direction dir);
> -
> extern void
> swiotlb_unmap_sg(struct device *hwdev, struct scatterlist *sg, int nents,
> enum dma_data_direction dir);
> diff --git a/lib/swiotlb.c b/lib/swiotlb.c
> index 22e13a0..47aad37 100644
> --- a/lib/swiotlb.c
> +++ b/lib/swiotlb.c
> @@ -910,14 +910,6 @@ void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
> }
> EXPORT_SYMBOL(swiotlb_map_sg_attrs);
>
> -int
> -swiotlb_map_sg(struct device *hwdev, struct scatterlist *sgl, int nelems,
> - enum dma_data_direction dir)
> -{
> - return swiotlb_map_sg_attrs(hwdev, sgl, nelems, dir, 0);
> -}
> -EXPORT_SYMBOL(swiotlb_map_sg);
> -
> /*
> * Unmap a set of streaming mode DMA translations. Again, cpu read rules
> * concerning calls here are the same as for swiotlb_unmap_page() above.
>
The use of DMA_ATTR_SKIP_CPU_SYNC was not consistent across all of the DMA
APIs in the arch/arm folder. This change is meant to correct that so that
we get consistent behavior.
Signed-off-by: Alexander Duyck <[email protected]>
---
arch/frv/mb93090-mb00/pci-dma-nommu.c | 16 +++++++++++-----
arch/frv/mb93090-mb00/pci-dma.c | 7 ++++++-
2 files changed, 17 insertions(+), 6 deletions(-)
diff --git a/arch/frv/mb93090-mb00/pci-dma-nommu.c b/arch/frv/mb93090-mb00/pci-dma-nommu.c
index 90f2e4c..ff606d1 100644
--- a/arch/frv/mb93090-mb00/pci-dma-nommu.c
+++ b/arch/frv/mb93090-mb00/pci-dma-nommu.c
@@ -109,16 +109,19 @@ static int frv_dma_map_sg(struct device *dev, struct scatterlist *sglist,
int nents, enum dma_data_direction direction,
unsigned long attrs)
{
- int i;
struct scatterlist *sg;
+ int i;
+
+ WARN_ON(direction == DMA_NONE);
+
+ if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+ return nents;
for_each_sg(sglist, sg, nents, i) {
frv_cache_wback_inv(sg_dma_address(sg),
sg_dma_address(sg) + sg_dma_len(sg));
}
- BUG_ON(direction == DMA_NONE);
-
return nents;
}
@@ -126,8 +129,11 @@ static dma_addr_t frv_dma_map_page(struct device *dev, struct page *page,
unsigned long offset, size_t size,
enum dma_data_direction direction, unsigned long attrs)
{
- BUG_ON(direction == DMA_NONE);
- flush_dcache_page(page);
+ WARN_ON(direction == DMA_NONE);
+
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ flush_dcache_page(page);
+
return (dma_addr_t) page_to_phys(page) + offset;
}
diff --git a/arch/frv/mb93090-mb00/pci-dma.c b/arch/frv/mb93090-mb00/pci-dma.c
index f585745..ee5dadf 100644
--- a/arch/frv/mb93090-mb00/pci-dma.c
+++ b/arch/frv/mb93090-mb00/pci-dma.c
@@ -52,6 +52,9 @@ static int frv_dma_map_sg(struct device *dev, struct scatterlist *sglist,
for_each_sg(sglist, sg, nents, i) {
vaddr = kmap_atomic_primary(sg_page(sg));
+ if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+ continue;
+
frv_dcache_writeback((unsigned long) vaddr,
(unsigned long) vaddr + PAGE_SIZE);
@@ -70,7 +73,9 @@ static dma_addr_t frv_dma_map_page(struct device *dev, struct page *page,
unsigned long offset, size_t size,
enum dma_data_direction direction, unsigned long attrs)
{
- flush_dcache_page(page);
+ if (!(attr & DMA_ATTR_SKIP_CPU_SYNC))
+ flush_dcache_page(page);
+
return (dma_addr_t) page_to_phys(page) + offset;
}
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
later via a sync_for_cpu or sync_for_device call.
Cc: Richard Kuo <[email protected]>
Cc: [email protected]
Signed-off-by: Alexander Duyck <[email protected]>
---
arch/hexagon/kernel/dma.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/hexagon/kernel/dma.c b/arch/hexagon/kernel/dma.c
index b901778..dbc4f10 100644
--- a/arch/hexagon/kernel/dma.c
+++ b/arch/hexagon/kernel/dma.c
@@ -119,6 +119,9 @@ static int hexagon_map_sg(struct device *hwdev, struct scatterlist *sg,
s->dma_length = s->length;
+ if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
+ continue;
+
flush_dcache_range(dma_addr_to_virt(s->dma_address),
dma_addr_to_virt(s->dma_address + s->length));
}
@@ -180,7 +183,8 @@ static dma_addr_t hexagon_map_page(struct device *dev, struct page *page,
if (!check_addr("map_single", dev, bus, size))
return bad_dma_address;
- dma_sync(dma_addr_to_virt(bus), size, dir);
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ dma_sync(dma_addr_to_virt(bus), size, dir);
return bus;
}
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
later via a sync_for_cpu or sync_for_device call.
Cc: Vineet Gupta <[email protected]>
Cc: [email protected]
Signed-off-by: Alexander Duyck <[email protected]>
---
arch/arc/mm/dma.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/arc/mm/dma.c b/arch/arc/mm/dma.c
index 20afc65..d0c4b28 100644
--- a/arch/arc/mm/dma.c
+++ b/arch/arc/mm/dma.c
@@ -133,7 +133,8 @@ static dma_addr_t arc_dma_map_page(struct device *dev, struct page *page,
unsigned long attrs)
{
phys_addr_t paddr = page_to_phys(page) + offset;
- _dma_cache_sync(paddr, size, dir);
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ _dma_cache_sync(paddr, size, dir);
return plat_phys_to_dma(dev, paddr);
}
As a first step to making DMA_ATTR_SKIP_CPU_SYNC apply to architectures
beyond just ARM I need to make it so that the swiotlb will respect the
flag. In order to do that I also need to update the swiotlb-xen since it
heavily makes use of the functionality.
Cc: Konrad Rzeszutek Wilk <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
---
drivers/xen/swiotlb-xen.c | 40 ++++++++++++++++++++++----------------
include/linux/swiotlb.h | 6 ++++--
lib/swiotlb.c | 48 +++++++++++++++++++++++++++------------------
3 files changed, 56 insertions(+), 38 deletions(-)
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 87e6035..cf047d8 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -405,7 +405,8 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
*/
trace_swiotlb_bounced(dev, dev_addr, size, swiotlb_force);
- map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, size, dir);
+ map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, size, dir,
+ attrs);
if (map == SWIOTLB_MAP_ERROR)
return DMA_ERROR_CODE;
@@ -416,11 +417,13 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
/*
* Ensure that the address returned is DMA'ble
*/
- if (!dma_capable(dev, dev_addr, size)) {
- swiotlb_tbl_unmap_single(dev, map, size, dir);
- dev_addr = 0;
- }
- return dev_addr;
+ if (dma_capable(dev, dev_addr, size))
+ return dev_addr;
+
+ swiotlb_tbl_unmap_single(dev, map, size, dir,
+ attrs | DMA_ATTR_SKIP_CPU_SYNC);
+
+ return DMA_ERROR_CODE;
}
EXPORT_SYMBOL_GPL(xen_swiotlb_map_page);
@@ -444,7 +447,7 @@ static void xen_unmap_single(struct device *hwdev, dma_addr_t dev_addr,
/* NOTE: We use dev_addr here, not paddr! */
if (is_xen_swiotlb_buffer(dev_addr)) {
- swiotlb_tbl_unmap_single(hwdev, paddr, size, dir);
+ swiotlb_tbl_unmap_single(hwdev, paddr, size, dir, attrs);
return;
}
@@ -557,16 +560,9 @@ void xen_swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
start_dma_addr,
sg_phys(sg),
sg->length,
- dir);
- if (map == SWIOTLB_MAP_ERROR) {
- dev_warn(hwdev, "swiotlb buffer is full\n");
- /* Don't panic here, we expect map_sg users
- to do proper error handling. */
- xen_swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir,
- attrs);
- sg_dma_len(sgl) = 0;
- return 0;
- }
+ dir, attrs);
+ if (map == SWIOTLB_MAP_ERROR)
+ goto map_error;
xen_dma_map_page(hwdev, pfn_to_page(map >> PAGE_SHIFT),
dev_addr,
map & ~PAGE_MASK,
@@ -589,6 +585,16 @@ void xen_swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
sg_dma_len(sg) = sg->length;
}
return nelems;
+map_error:
+ dev_warn(hwdev, "swiotlb buffer is full\n");
+ /*
+ * Don't panic here, we expect map_sg users
+ * to do proper error handling.
+ */
+ xen_swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir,
+ attrs | DMA_ATTR_SKIP_CPU_SYNC);
+ sg_dma_len(sgl) = 0;
+ return 0;
}
EXPORT_SYMBOL_GPL(xen_swiotlb_map_sg_attrs);
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index e237b6f..4517be9 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -44,11 +44,13 @@ enum dma_sync_target {
extern phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
dma_addr_t tbl_dma_addr,
phys_addr_t phys, size_t size,
- enum dma_data_direction dir);
+ enum dma_data_direction dir,
+ unsigned long attrs);
extern void swiotlb_tbl_unmap_single(struct device *hwdev,
phys_addr_t tlb_addr,
- size_t size, enum dma_data_direction dir);
+ size_t size, enum dma_data_direction dir,
+ unsigned long attrs);
extern void swiotlb_tbl_sync_single(struct device *hwdev,
phys_addr_t tlb_addr,
diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 47aad37..b538d39 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -425,7 +425,8 @@ static void swiotlb_bounce(phys_addr_t orig_addr, phys_addr_t tlb_addr,
phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
dma_addr_t tbl_dma_addr,
phys_addr_t orig_addr, size_t size,
- enum dma_data_direction dir)
+ enum dma_data_direction dir,
+ unsigned long attrs)
{
unsigned long flags;
phys_addr_t tlb_addr;
@@ -526,7 +527,8 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
*/
for (i = 0; i < nslots; i++)
io_tlb_orig_addr[index+i] = orig_addr + (i << IO_TLB_SHIFT);
- if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)
+ if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
+ (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
swiotlb_bounce(orig_addr, tlb_addr, size, DMA_TO_DEVICE);
return tlb_addr;
@@ -539,18 +541,20 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
static phys_addr_t
map_single(struct device *hwdev, phys_addr_t phys, size_t size,
- enum dma_data_direction dir)
+ enum dma_data_direction dir, unsigned long attrs)
{
dma_addr_t start_dma_addr = phys_to_dma(hwdev, io_tlb_start);
- return swiotlb_tbl_map_single(hwdev, start_dma_addr, phys, size, dir);
+ return swiotlb_tbl_map_single(hwdev, start_dma_addr, phys, size,
+ dir, attrs);
}
/*
* dma_addr is the kernel virtual address of the bounce buffer to unmap.
*/
void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
- size_t size, enum dma_data_direction dir)
+ size_t size, enum dma_data_direction dir,
+ unsigned long attrs)
{
unsigned long flags;
int i, count, nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
@@ -561,6 +565,7 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
* First, sync the memory before unmapping the entry
*/
if (orig_addr != INVALID_PHYS_ADDR &&
+ !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL)))
swiotlb_bounce(orig_addr, tlb_addr, size, DMA_FROM_DEVICE);
@@ -654,7 +659,8 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
* GFP_DMA memory; fall back on map_single(), which
* will grab memory from the lowest available address range.
*/
- phys_addr_t paddr = map_single(hwdev, 0, size, DMA_FROM_DEVICE);
+ phys_addr_t paddr = map_single(hwdev, 0, size,
+ DMA_FROM_DEVICE, 0);
if (paddr == SWIOTLB_MAP_ERROR)
goto err_warn;
@@ -669,7 +675,8 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
/* DMA_TO_DEVICE to avoid memcpy in unmap_single */
swiotlb_tbl_unmap_single(hwdev, paddr,
- size, DMA_TO_DEVICE);
+ size, DMA_TO_DEVICE,
+ DMA_ATTR_SKIP_CPU_SYNC);
goto err_warn;
}
}
@@ -699,7 +706,7 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
free_pages((unsigned long)vaddr, get_order(size));
else
/* DMA_TO_DEVICE to avoid memcpy in swiotlb_tbl_unmap_single */
- swiotlb_tbl_unmap_single(hwdev, paddr, size, DMA_TO_DEVICE);
+ swiotlb_tbl_unmap_single(hwdev, paddr, size, DMA_TO_DEVICE, 0);
}
EXPORT_SYMBOL(swiotlb_free_coherent);
@@ -755,7 +762,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
trace_swiotlb_bounced(dev, dev_addr, size, swiotlb_force);
/* Oh well, have to allocate and map a bounce buffer. */
- map = map_single(dev, phys, size, dir);
+ map = map_single(dev, phys, size, dir, attrs);
if (map == SWIOTLB_MAP_ERROR) {
swiotlb_full(dev, size, dir, 1);
return phys_to_dma(dev, io_tlb_overflow_buffer);
@@ -764,12 +771,13 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
dev_addr = phys_to_dma(dev, map);
/* Ensure that the address returned is DMA'ble */
- if (!dma_capable(dev, dev_addr, size)) {
- swiotlb_tbl_unmap_single(dev, map, size, dir);
- return phys_to_dma(dev, io_tlb_overflow_buffer);
- }
+ if (dma_capable(dev, dev_addr, size))
+ return dev_addr;
+
+ swiotlb_tbl_unmap_single(dev, map, size, dir,
+ attrs | DMA_ATTR_SKIP_CPU_SYNC);
- return dev_addr;
+ return phys_to_dma(dev, io_tlb_overflow_buffer);
}
EXPORT_SYMBOL_GPL(swiotlb_map_page);
@@ -782,14 +790,15 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
* whatever the device wrote there.
*/
static void unmap_single(struct device *hwdev, dma_addr_t dev_addr,
- size_t size, enum dma_data_direction dir)
+ size_t size, enum dma_data_direction dir,
+ unsigned long attrs)
{
phys_addr_t paddr = dma_to_phys(hwdev, dev_addr);
BUG_ON(dir == DMA_NONE);
if (is_swiotlb_buffer(paddr)) {
- swiotlb_tbl_unmap_single(hwdev, paddr, size, dir);
+ swiotlb_tbl_unmap_single(hwdev, paddr, size, dir, attrs);
return;
}
@@ -809,7 +818,7 @@ void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
size_t size, enum dma_data_direction dir,
unsigned long attrs)
{
- unmap_single(hwdev, dev_addr, size, dir);
+ unmap_single(hwdev, dev_addr, size, dir, attrs);
}
EXPORT_SYMBOL_GPL(swiotlb_unmap_page);
@@ -891,7 +900,7 @@ void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
if (swiotlb_force ||
!dma_capable(hwdev, dev_addr, sg->length)) {
phys_addr_t map = map_single(hwdev, sg_phys(sg),
- sg->length, dir);
+ sg->length, dir, attrs);
if (map == SWIOTLB_MAP_ERROR) {
/* Don't panic here, we expect map_sg users
to do proper error handling. */
@@ -925,7 +934,8 @@ void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
BUG_ON(dir == DMA_NONE);
for_each_sg(sgl, sg, nelems, i)
- unmap_single(hwdev, sg->dma_address, sg_dma_len(sg), dir);
+ unmap_single(hwdev, sg->dma_address, sg_dma_len(sg), dir,
+ attrs);
}
EXPORT_SYMBOL(swiotlb_unmap_sg_attrs);
There are no users for swiotlb_map_sg so we might as well just drop it.
Cc: Konrad Rzeszutek Wilk <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
---
include/linux/swiotlb.h | 4 ----
lib/swiotlb.c | 8 --------
2 files changed, 12 deletions(-)
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 5f81f8a..e237b6f 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -72,10 +72,6 @@ extern void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
size_t size, enum dma_data_direction dir,
unsigned long attrs);
-extern int
-swiotlb_map_sg(struct device *hwdev, struct scatterlist *sg, int nents,
- enum dma_data_direction dir);
-
extern void
swiotlb_unmap_sg(struct device *hwdev, struct scatterlist *sg, int nents,
enum dma_data_direction dir);
diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 22e13a0..47aad37 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -910,14 +910,6 @@ void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
}
EXPORT_SYMBOL(swiotlb_map_sg_attrs);
-int
-swiotlb_map_sg(struct device *hwdev, struct scatterlist *sgl, int nelems,
- enum dma_data_direction dir)
-{
- return swiotlb_map_sg_attrs(hwdev, sgl, nelems, dir, 0);
-}
-EXPORT_SYMBOL(swiotlb_map_sg);
-
/*
* Unmap a set of streaming mode DMA translations. Again, cpu read rules
* concerning calls here are the same as for swiotlb_unmap_page() above.
From: Alexander Duyck <[email protected]>
Date: Mon, 24 Oct 2016 08:06:07 -0400
> This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
> avoid invoking cache line invalidation if the driver will just handle it
> via a sync_for_cpu or sync_for_device call.
>
> Cc: "David S. Miller" <[email protected]>
> Cc: [email protected]
> Signed-off-by: Alexander Duyck <[email protected]>
This is fine for avoiding the flush for performance reasons, but the
chip isn't going to write anything back unless the device wrote into
the area.
Around Mon 24 Oct 2016 08:04:53 -0400 or thereabout, Alexander Duyck wrote:
> The use of DMA_ATTR_SKIP_CPU_SYNC was not consistent across all of the DMA
> APIs in the arch/arm folder. This change is meant to correct that so that
> we get consistent behavior.
Looks good (-:
> Cc: Haavard Skinnemoen <[email protected]>
> Cc: Hans-Christian Egtvedt <[email protected]>
> Signed-off-by: Alexander Duyck <[email protected]>
Acked-by: Hans-Christian Noren Egtvedt <[email protected]>
> ---
> arch/avr32/mm/dma-coherent.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/arch/avr32/mm/dma-coherent.c b/arch/avr32/mm/dma-coherent.c
> index 58610d0..54534e5 100644
> --- a/arch/avr32/mm/dma-coherent.c
> +++ b/arch/avr32/mm/dma-coherent.c
> @@ -146,7 +146,8 @@ static dma_addr_t avr32_dma_map_page(struct device *dev, struct page *page,
> {
> void *cpu_addr = page_address(page) + offset;
>
> - dma_cache_sync(dev, cpu_addr, size, direction);
> + if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
> + dma_cache_sync(dev, cpu_addr, size, direction);
> return virt_to_bus(cpu_addr);
> }
>
> @@ -162,6 +163,10 @@ static int avr32_dma_map_sg(struct device *dev, struct scatterlist *sglist,
>
> sg->dma_address = page_to_bus(sg_page(sg)) + sg->offset;
> virt = sg_virt(sg);
> +
> + if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
> + continue;
> +
> dma_cache_sync(dev, virt, sg->length, direction);
> }
>
--
mvh
Hans-Christian Noren Egtvedt
On Mon, Oct 24, 2016 at 11:09 AM, Konrad Rzeszutek Wilk
<[email protected]> wrote:
> On Mon, Oct 24, 2016 at 08:04:37AM -0400, Alexander Duyck wrote:
>> As a first step to making DMA_ATTR_SKIP_CPU_SYNC apply to architectures
>> beyond just ARM I need to make it so that the swiotlb will respect the
>> flag. In order to do that I also need to update the swiotlb-xen since it
>> heavily makes use of the functionality.
>>
>> Cc: Konrad Rzeszutek Wilk <[email protected]>
>> Signed-off-by: Alexander Duyck <[email protected]>
>> ---
>> drivers/xen/swiotlb-xen.c | 40 ++++++++++++++++++++++----------------
>> include/linux/swiotlb.h | 6 ++++--
>> lib/swiotlb.c | 48 +++++++++++++++++++++++++++------------------
>> 3 files changed, 56 insertions(+), 38 deletions(-)
>>
>> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
>> index 87e6035..cf047d8 100644
>> --- a/drivers/xen/swiotlb-xen.c
>> +++ b/drivers/xen/swiotlb-xen.c
>> @@ -405,7 +405,8 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
>> */
>> trace_swiotlb_bounced(dev, dev_addr, size, swiotlb_force);
>>
>> - map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, size, dir);
>> + map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, size, dir,
>> + attrs);
>> if (map == SWIOTLB_MAP_ERROR)
>> return DMA_ERROR_CODE;
>>
>> @@ -416,11 +417,13 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
>> /*
>> * Ensure that the address returned is DMA'ble
>> */
>> - if (!dma_capable(dev, dev_addr, size)) {
>> - swiotlb_tbl_unmap_single(dev, map, size, dir);
>> - dev_addr = 0;
>> - }
>> - return dev_addr;
>> + if (dma_capable(dev, dev_addr, size))
>> + return dev_addr;
>> +
>> + swiotlb_tbl_unmap_single(dev, map, size, dir,
>> + attrs | DMA_ATTR_SKIP_CPU_SYNC);
>> +
>> + return DMA_ERROR_CODE;
>
> Why? This change (re-ordering the code - and returning DMA_ERROR_CODE instead
> of 0) does not have anything to do with the title.
>
> If you really feel strongly about it - then please send it as a seperate patch.
Okay I can do that. This was mostly just to clean up the formatting
because I was over 80 characters when I added the attribute. Changing
the return value to DMA_ERROR_CODE from 0 was based on the fact that
earlier in the function that is the value you return if there is a
mapping error.
>> }
>> EXPORT_SYMBOL_GPL(xen_swiotlb_map_page);
>>
>> @@ -444,7 +447,7 @@ static void xen_unmap_single(struct device *hwdev, dma_addr_t dev_addr,
>>
>> /* NOTE: We use dev_addr here, not paddr! */
>> if (is_xen_swiotlb_buffer(dev_addr)) {
>> - swiotlb_tbl_unmap_single(hwdev, paddr, size, dir);
>> + swiotlb_tbl_unmap_single(hwdev, paddr, size, dir, attrs);
>> return;
>> }
>>
>> @@ -557,16 +560,9 @@ void xen_swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
>> start_dma_addr,
>> sg_phys(sg),
>> sg->length,
>> - dir);
>> - if (map == SWIOTLB_MAP_ERROR) {
>> - dev_warn(hwdev, "swiotlb buffer is full\n");
>> - /* Don't panic here, we expect map_sg users
>> - to do proper error handling. */
>> - xen_swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir,
>> - attrs);
>> - sg_dma_len(sgl) = 0;
>> - return 0;
>> - }
>> + dir, attrs);
>> + if (map == SWIOTLB_MAP_ERROR)
>> + goto map_error;
>> xen_dma_map_page(hwdev, pfn_to_page(map >> PAGE_SHIFT),
>> dev_addr,
>> map & ~PAGE_MASK,
>> @@ -589,6 +585,16 @@ void xen_swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
>> sg_dma_len(sg) = sg->length;
>> }
>> return nelems;
>> +map_error:
>> + dev_warn(hwdev, "swiotlb buffer is full\n");
>> + /*
>> + * Don't panic here, we expect map_sg users
>> + * to do proper error handling.
>> + */
>> + xen_swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir,
>> + attrs | DMA_ATTR_SKIP_CPU_SYNC);
>> + sg_dma_len(sgl) = 0;
>> + return 0;
>> }
>
> This too. Why can't that be part of the existing code that was there?
Once again it was a formatting thing. I was indented too far and
adding the attribute pushed me over 80 characters so I broke it out to
a label to avoid the problem.
- Alex
On Mon, Oct 24, 2016 at 11:27 AM, David Miller <[email protected]> wrote:
> From: Alexander Duyck <[email protected]>
> Date: Mon, 24 Oct 2016 08:06:07 -0400
>
>> This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
>> avoid invoking cache line invalidation if the driver will just handle it
>> via a sync_for_cpu or sync_for_device call.
>>
>> Cc: "David S. Miller" <[email protected]>
>> Cc: [email protected]
>> Signed-off-by: Alexander Duyck <[email protected]>
>
> This is fine for avoiding the flush for performance reasons, but the
> chip isn't going to write anything back unless the device wrote into
> the area.
That is mostly what I am doing here. The original implementation was
mostly for performance. I am trying to take the attribute that was
already in place for ARM and apply it to all the other architectures.
So what will be happening now is that we call the map function with
this attribute set and then use the sync functions to map it to the
device and then pull the mapping later.
The idea is that if Jesper does his page pool stuff it would be
calling the map/unmap functions and then the drivers would be doing
the sync_for_cpu/sync_for_device. I want to make sure the map is
cheap and we will have to call sync_for_cpu from the drivers anyway
since there is no guarantee if we will have a new page or be reusing
an existing one.
- Alex
On Mon, 2016-10-24 at 08:05 -0400, Alexander Duyck wrote:
> This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
> avoid invoking cache line invalidation if the driver will just handle it
> later via a sync_for_cpu or sync_for_device call.
>
> Cc: Mark Salter <[email protected]>
> Cc: Aurelien Jacquiot <[email protected]>
> Cc: [email protected]
> Signed-off-by: Alexander Duyck <[email protected]>
> ---
Acked-by: Mark Salter <[email protected]>
> arch/c6x/kernel/dma.c | 16 +++++++++++-----
> 1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/arch/c6x/kernel/dma.c b/arch/c6x/kernel/dma.c
> index db4a6a3..d28df74 100644
> --- a/arch/c6x/kernel/dma.c
> +++ b/arch/c6x/kernel/dma.c
> @@ -42,14 +42,17 @@ static dma_addr_t c6x_dma_map_page(struct device *dev, struct page *page,
> {
> dma_addr_t handle = virt_to_phys(page_address(page) + offset);
>
> - c6x_dma_sync(handle, size, dir);
> + if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
> + c6x_dma_sync(handle, size, dir);
> +
> return handle;
> }
>
> static void c6x_dma_unmap_page(struct device *dev, dma_addr_t handle,
> size_t size, enum dma_data_direction dir, unsigned long attrs)
> {
> - c6x_dma_sync(handle, size, dir);
> + if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
> + c6x_dma_sync(handle, size, dir);
> }
>
> static int c6x_dma_map_sg(struct device *dev, struct scatterlist *sglist,
> @@ -60,7 +63,8 @@ static int c6x_dma_map_sg(struct device *dev, struct scatterlist *sglist,
>
> for_each_sg(sglist, sg, nents, i) {
> sg->dma_address = sg_phys(sg);
> - c6x_dma_sync(sg->dma_address, sg->length, dir);
> + if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
> + c6x_dma_sync(sg->dma_address, sg->length, dir);
> }
>
> return nents;
> @@ -72,8 +76,10 @@ static void c6x_dma_unmap_sg(struct device *dev, struct scatterlist *sglist,
> struct scatterlist *sg;
> int i;
>
> - for_each_sg(sglist, sg, nents, i)
> - c6x_dma_sync(sg_dma_address(sg), sg->length, dir);
> + for_each_sg(sglist, sg, nents, i) {
> + if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
> + c6x_dma_sync(sg_dma_address(sg), sg->length, dir);
> + }
>
> }
>
>
On Mon, Oct 24, 2016 at 08:04:47AM -0400, Alexander Duyck wrote:
> The use of DMA_ATTR_SKIP_CPU_SYNC was not consistent across all of the DMA
> APIs in the arch/arm folder. This change is meant to correct that so that
> we get consistent behavior.
I'm really not convinced that this is anywhere close to correct behaviour.
If we're DMA-ing to a buffer, and we unmap it or sync_for_cpu, then we
will want to access the DMA'd data - especially in the sync_for_cpu case,
it's pointless to call sync_for_cpu if we're not going to access the
data.
So the idea of skipping the CPU copy when DMA_ATTR_SKIP_CPU_SYNC is set
seems to be completely wrong - it means we end up reading the stale data
that was in the buffer, completely ignoring whatever was DMA'd to it.
What's the use case for DMA_ATTR_SKIP_CPU_SYNC ?
--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
On Mon, Oct 31, 2016 at 3:20 AM, Russell King - ARM Linux
<[email protected]> wrote:
> On Mon, Oct 24, 2016 at 08:04:47AM -0400, Alexander Duyck wrote:
>> The use of DMA_ATTR_SKIP_CPU_SYNC was not consistent across all of the DMA
>> APIs in the arch/arm folder. This change is meant to correct that so that
>> we get consistent behavior.
>
> I'm really not convinced that this is anywhere close to correct behaviour.
>
> If we're DMA-ing to a buffer, and we unmap it or sync_for_cpu, then we
> will want to access the DMA'd data - especially in the sync_for_cpu case,
> it's pointless to call sync_for_cpu if we're not going to access the
> data.
First, let me clarify. The sync_for_cpu call will still work the
same. This only effects the map/unmap calls.
> So the idea of skipping the CPU copy when DMA_ATTR_SKIP_CPU_SYNC is set
> seems to be completely wrong - it means we end up reading the stale data
> that was in the buffer, completely ignoring whatever was DMA'd to it.
I agree. However this is meant to be used in the dma_unmap call only
if sync_for_cpu has already been called for the regions that could
have been updated by the device.
> What's the use case for DMA_ATTR_SKIP_CPU_SYNC ?
The main use case I have in mind is to allow for pseudo-static DMA
mappings where we can share them between the network stack and the
device driver. I use igb as an example.
1 allocate page, reset page_offset to 0
2 map page while passing DMA_ATTR_SKIP_CPU_SYNC
3 dma_sync_single_range_for_device starting at page_offset, length
2K (largest possible write by device)
4 device performs Rx DMA and updates Rx descriptor
5 read length from Rx descriptor
6 dma_sync_single_range_for_cpu starting at page_offset, length
reported by descriptor
7 if page_count == 1
7.1 update page_offset with xor 2K
7.2 hand page up to network stack
7.3 goto 3
8 unmap page with DMA_ATTR_SKIP_CPU_SYNC
9 hand page up to network stack
10 goto 1
The idea is we want to be able to have a page be accessible to the
device, but be able to share it with the network stack which might try
to write to the page. By letting the driver handle the
synchronization we get two main advantages. First we end up looping
over fewer cache lines as we only have to invalidate the region
updated by the device in steps 3 and 6 instead of the entire page.
The other advantage is that the pages are writable by the network
stack since step 8 will not invalidate the entire mapping.
I am just as concerned about the possibility of stale data. That is
why I have gone through and made sure that any path in the igb driver
called sync for the region held by the device before we might call
unmap. It isn't that I don't want the data to be kept fresh, it is a
matter of wanting control over what we are invalidating. Here is a
link to the igb patch I have that was a part of this set.
https://patchwork.ozlabs.org/patch/686747/
Thanks.
- Alex