2009-07-10 01:12:02

by FUJITA Tomonori

[permalink] [raw]
Subject: [00/15] swiotlb cleanup

- removes unused (and unnecessary) hooks in swiotlb.

- adds dma_capable() and converts swiotlb to use it. It can be used to
know if a memory area is dma capable or not. I added
is_buffer_dma_capable() for the same purpose long ago but it turned
out that the function doesn't work on POWERPC.

This can be applied cleanly to linux-next, -mm, and mainline. This
patchset touches multiple architectures (ia64, powerpc, x86) so I
guess that -mm is appropriate for this patchset (I don't care much
what tree would merge this though).

This is tested on x86 but only compile tested on POWERPC and IA64.

Thanks,

=
arch/ia64/include/asm/dma-mapping.h | 18 ++++++
arch/powerpc/include/asm/dma-mapping.h | 23 +++++++
arch/powerpc/kernel/dma-swiotlb.c | 48 +---------------
arch/x86/include/asm/dma-mapping.h | 18 ++++++
arch/x86/kernel/pci-dma.c | 2 +-
arch/x86/kernel/pci-gart_64.c | 5 +-
arch/x86/kernel/pci-nommu.c | 2 +-
arch/x86/kernel/pci-swiotlb.c | 25 --------
include/linux/dma-mapping.h | 5 --
include/linux/swiotlb.h | 11 ----
lib/swiotlb.c | 102 +++++++++-----------------------
11 files changed, 92 insertions(+), 167 deletions(-)



2009-07-10 01:09:29

by FUJITA Tomonori

[permalink] [raw]
Subject: [PATCH 01/15] swiotlb: remove unused swiotlb_alloc_boot()

Nobody uses swiotlb_alloc_boot().

Signed-off-by: FUJITA Tomonori <[email protected]>
---
arch/x86/kernel/pci-swiotlb.c | 5 -----
include/linux/swiotlb.h | 2 --
lib/swiotlb.c | 7 +------
3 files changed, 1 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index 6af96ee..0ac7cd5 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -13,11 +13,6 @@

int swiotlb __read_mostly;

-void * __init swiotlb_alloc_boot(size_t size, unsigned long nslabs)
-{
- return alloc_bootmem_low_pages(size);
-}
-
void *swiotlb_alloc(unsigned order, unsigned long nslabs)
{
return (void *)__get_free_pages(GFP_DMA | __GFP_NOWARN, order);
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index cb1a663..94db704 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -14,7 +14,6 @@ struct scatterlist;
*/
#define IO_TLB_SEGSIZE 128

-
/*
* log of the size of each IO TLB slab. The number of slabs is command line
* controllable.
@@ -24,7 +23,6 @@ struct scatterlist;
extern void
swiotlb_init(void);

-extern void *swiotlb_alloc_boot(size_t bytes, unsigned long nslabs);
extern void *swiotlb_alloc(unsigned order, unsigned long nslabs);

extern dma_addr_t swiotlb_phys_to_bus(struct device *hwdev,
diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index bffe6d7..9edfdd4 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -114,11 +114,6 @@ setup_io_tlb_npages(char *str)
__setup("swiotlb=", setup_io_tlb_npages);
/* make io_tlb_overflow tunable too? */

-void * __weak __init swiotlb_alloc_boot(size_t size, unsigned long nslabs)
-{
- return alloc_bootmem_low_pages(size);
-}
-
void * __weak swiotlb_alloc(unsigned order, unsigned long nslabs)
{
return (void *)__get_free_pages(GFP_DMA | __GFP_NOWARN, order);
@@ -189,7 +184,7 @@ swiotlb_init_with_default_size(size_t default_size)
/*
* Get IO TLB memory from the low pages
*/
- io_tlb_start = swiotlb_alloc_boot(bytes, io_tlb_nslabs);
+ io_tlb_start = alloc_bootmem_low_pages(bytes);
if (!io_tlb_start)
panic("Cannot allocate SWIOTLB buffer");
io_tlb_end = io_tlb_start + bytes;
--
1.6.0.6

2009-07-10 01:09:43

by FUJITA Tomonori

[permalink] [raw]
Subject: [PATCH 06/15] x86: replace is_buffer_dma_capable() with dma_capable

Signed-off-by: FUJITA Tomonori <[email protected]>
---
arch/x86/kernel/pci-dma.c | 2 +-
arch/x86/kernel/pci-gart_64.c | 5 ++---
arch/x86/kernel/pci-nommu.c | 2 +-
3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index 1a041bc..3c945c0 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -147,7 +147,7 @@ again:
return NULL;

addr = page_to_phys(page);
- if (!is_buffer_dma_capable(dma_mask, addr, size)) {
+ if (addr + size > dma_mask) {
__free_pages(page, get_order(size));

if (dma_mask < DMA_BIT_MASK(32) && !(flag & GFP_DMA)) {
diff --git a/arch/x86/kernel/pci-gart_64.c b/arch/x86/kernel/pci-gart_64.c
index d2e56b8..98a827e 100644
--- a/arch/x86/kernel/pci-gart_64.c
+++ b/arch/x86/kernel/pci-gart_64.c
@@ -190,14 +190,13 @@ static void iommu_full(struct device *dev, size_t size, int dir)
static inline int
need_iommu(struct device *dev, unsigned long addr, size_t size)
{
- return force_iommu ||
- !is_buffer_dma_capable(*dev->dma_mask, addr, size);
+ return force_iommu || !dma_capable(dev, addr, size);
}

static inline int
nonforced_iommu(struct device *dev, unsigned long addr, size_t size)
{
- return !is_buffer_dma_capable(*dev->dma_mask, addr, size);
+ return !dma_capable(dev, addr, size);
}

/* Map a single continuous physical area into the IOMMU.
diff --git a/arch/x86/kernel/pci-nommu.c b/arch/x86/kernel/pci-nommu.c
index 71d412a..c0a4222 100644
--- a/arch/x86/kernel/pci-nommu.c
+++ b/arch/x86/kernel/pci-nommu.c
@@ -14,7 +14,7 @@
static int
check_addr(char *name, struct device *hwdev, dma_addr_t bus, size_t size)
{
- if (hwdev && !is_buffer_dma_capable(*hwdev->dma_mask, bus, size)) {
+ if (hwdev && !dma_capable(hwdev, bus, size)) {
if (*hwdev->dma_mask >= DMA_BIT_MASK(32))
printk(KERN_ERR
"nommu_%s: overflow %Lx+%zu of device mask %Lx\n",
--
1.6.0.6

2009-07-10 01:09:54

by FUJITA Tomonori

[permalink] [raw]
Subject: [PATCH 03/15] swiotlb: remove swiotlb_arch_range_needs_mapping

Nobody uses swiotlb_arch_range_needs_mapping().

Signed-off-by: FUJITA Tomonori <[email protected]>
---
arch/x86/kernel/pci-swiotlb.c | 5 -----
include/linux/swiotlb.h | 2 --
lib/swiotlb.c | 15 ++-------------
3 files changed, 2 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index ea675cf..165bd7f 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -23,11 +23,6 @@ phys_addr_t swiotlb_bus_to_phys(struct device *hwdev, dma_addr_t baddr)
return baddr;
}

-int __weak swiotlb_arch_range_needs_mapping(phys_addr_t paddr, size_t size)
-{
- return 0;
-}
-
static void *x86_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
dma_addr_t *dma_handle, gfp_t flags)
{
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 6bc5094..a977da2 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -28,8 +28,6 @@ extern dma_addr_t swiotlb_phys_to_bus(struct device *hwdev,
extern phys_addr_t swiotlb_bus_to_phys(struct device *hwdev,
dma_addr_t address);

-extern int swiotlb_arch_range_needs_mapping(phys_addr_t paddr, size_t size);
-
extern void
*swiotlb_alloc_coherent(struct device *hwdev, size_t size,
dma_addr_t *dma_handle, gfp_t flags);
diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 3c4c21c..dc1cd1f 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -141,11 +141,6 @@ int __weak swiotlb_arch_address_needs_mapping(struct device *hwdev,
return !is_buffer_dma_capable(dma_get_mask(hwdev), addr, size);
}

-int __weak swiotlb_arch_range_needs_mapping(phys_addr_t paddr, size_t size)
-{
- return 0;
-}
-
static void swiotlb_print_info(unsigned long bytes)
{
phys_addr_t pstart, pend;
@@ -312,11 +307,6 @@ address_needs_mapping(struct device *hwdev, dma_addr_t addr, size_t size)
return swiotlb_arch_address_needs_mapping(hwdev, addr, size);
}

-static inline int range_needs_mapping(phys_addr_t paddr, size_t size)
-{
- return swiotlb_force || swiotlb_arch_range_needs_mapping(paddr, size);
-}
-
static int is_swiotlb_buffer(char *addr)
{
return addr >= io_tlb_start && addr < io_tlb_end;
@@ -646,8 +636,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
* we can safely return the device addr and not worry about bounce
* buffering it.
*/
- if (!address_needs_mapping(dev, dev_addr, size) &&
- !range_needs_mapping(phys, size))
+ if (!address_needs_mapping(dev, dev_addr, size) && !swiotlb_force)
return dev_addr;

/*
@@ -810,7 +799,7 @@ swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems,
phys_addr_t paddr = sg_phys(sg);
dma_addr_t dev_addr = swiotlb_phys_to_bus(hwdev, paddr);

- if (range_needs_mapping(paddr, sg->length) ||
+ if (swiotlb_force ||
address_needs_mapping(hwdev, dev_addr, sg->length)) {
void *map = map_single(hwdev, sg_phys(sg),
sg->length, dir);
--
1.6.0.6

2009-07-10 01:10:08

by FUJITA Tomonori

[permalink] [raw]
Subject: [PATCH 10/15] powerpc: remove unncesary swiotlb_arch_address_needs_mapping

swiotlb doesn't use swiotlb_arch_address_needs_mapping(); it uses
dma_capalbe(). We can remove unnecessary
swiotlb_arch_address_needs_mapping().

We can remove swiotlb_addr_needs_map() and is_buffer_dma_capable() in
swiotlb_pci_addr_needs_map() too; dma_capable() handles the features
that both provide.

Signed-off-by: FUJITA Tomonori <[email protected]>
---
arch/powerpc/kernel/dma-swiotlb.c | 27 +--------------------------
1 files changed, 1 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/kernel/dma-swiotlb.c b/arch/powerpc/kernel/dma-swiotlb.c
index 41534ae..a3bbe02 100644
--- a/arch/powerpc/kernel/dma-swiotlb.c
+++ b/arch/powerpc/kernel/dma-swiotlb.c
@@ -36,28 +36,11 @@ phys_addr_t swiotlb_bus_to_phys(struct device *hwdev, dma_addr_t baddr)
}

/*
- * Determine if an address needs bounce buffering via swiotlb.
- * Going forward I expect the swiotlb code to generalize on using
- * a dma_ops->addr_needs_map, and this function will move from here to the
- * generic swiotlb code.
- */
-int
-swiotlb_arch_address_needs_mapping(struct device *hwdev, dma_addr_t addr,
- size_t size)
-{
- struct dma_mapping_ops *dma_ops = get_dma_ops(hwdev);
-
- BUG_ON(!dma_ops);
- return dma_ops->addr_needs_map(hwdev, addr, size);
-}
-
-/*
* Determine if an address is reachable by a pci device, or if we must bounce.
*/
static int
swiotlb_pci_addr_needs_map(struct device *hwdev, dma_addr_t addr, size_t size)
{
- u64 mask = dma_get_mask(hwdev);
dma_addr_t max;
struct pci_controller *hose;
struct pci_dev *pdev = to_pci_dev(hwdev);
@@ -69,16 +52,9 @@ swiotlb_pci_addr_needs_map(struct device *hwdev, dma_addr_t addr, size_t size)
if ((addr + size > max) | (addr < hose->dma_window_base_cur))
return 1;

- return !is_buffer_dma_capable(mask, addr, size);
-}
-
-static int
-swiotlb_addr_needs_map(struct device *hwdev, dma_addr_t addr, size_t size)
-{
- return !is_buffer_dma_capable(dma_get_mask(hwdev), addr, size);
+ return 0;
}

-
/*
* At the moment, all platforms that use this code only require
* swiotlb to be used if we're operating on HIGHMEM. Since
@@ -94,7 +70,6 @@ struct dma_mapping_ops swiotlb_dma_ops = {
.dma_supported = swiotlb_dma_supported,
.map_page = swiotlb_map_page,
.unmap_page = swiotlb_unmap_page,
- .addr_needs_map = swiotlb_addr_needs_map,
.sync_single_range_for_cpu = swiotlb_sync_single_range_for_cpu,
.sync_single_range_for_device = swiotlb_sync_single_range_for_device,
.sync_sg_for_cpu = swiotlb_sync_sg_for_cpu,
--
1.6.0.6

2009-07-10 01:10:26

by FUJITA Tomonori

[permalink] [raw]
Subject: [PATCH 12/15] x86, IA64, powerpc: add phys_to_dma() and dma_to_phys()

This adds two functions, phys_to_dma() and dma_to_phys() to x86, IA64
and powerpc. swiotlb uses them. phys_to_dma() converts a physical
address to a dma address. dma_to_phys() does the opposite.

Signed-off-by: FUJITA Tomonori <[email protected]>
---
arch/ia64/include/asm/dma-mapping.h | 10 ++++++++++
arch/powerpc/include/asm/dma-mapping.h | 10 ++++++++++
arch/x86/include/asm/dma-mapping.h | 10 ++++++++++
3 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/arch/ia64/include/asm/dma-mapping.h b/arch/ia64/include/asm/dma-mapping.h
index 88d0f86..f91829d 100644
--- a/arch/ia64/include/asm/dma-mapping.h
+++ b/arch/ia64/include/asm/dma-mapping.h
@@ -77,6 +77,16 @@ static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t size)
return addr + size <= *dev->dma_mask;
}

+static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
+{
+ return paddr;
+}
+
+static inline phys_addr_t dma_to_phys(struct device *dev, dma_addr_t daddr)
+{
+ return daddr;
+}
+
extern int dma_get_cache_alignment(void);

static inline void
diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h
index 6ff1f85..0c34371 100644
--- a/arch/powerpc/include/asm/dma-mapping.h
+++ b/arch/powerpc/include/asm/dma-mapping.h
@@ -437,6 +437,16 @@ static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t size)
return addr + size <= *dev->dma_mask;
}

+static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
+{
+ return paddr + get_dma_direct_offset(dev);
+}
+
+static inline phys_addr_t dma_to_phys(struct device *dev, dma_addr_t daddr)
+{
+ return daddr - get_dma_direct_offset(dev);
+}
+
#define dma_alloc_noncoherent(d, s, h, f) dma_alloc_coherent(d, s, h, f)
#define dma_free_noncoherent(d, s, v, h) dma_free_coherent(d, s, v, h)
#ifdef CONFIG_NOT_COHERENT_CACHE
diff --git a/arch/x86/include/asm/dma-mapping.h b/arch/x86/include/asm/dma-mapping.h
index adac59c..0ee770d 100644
--- a/arch/x86/include/asm/dma-mapping.h
+++ b/arch/x86/include/asm/dma-mapping.h
@@ -63,6 +63,16 @@ static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t size)
return addr + size <= *dev->dma_mask;
}

+static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
+{
+ return paddr;
+}
+
+static inline phys_addr_t dma_to_phys(struct device *dev, dma_addr_t daddr)
+{
+ return daddr;
+}
+
static inline void
dma_cache_sync(struct device *dev, void *vaddr, size_t size,
enum dma_data_direction dir)
--
1.6.0.6

2009-07-10 01:10:42

by FUJITA Tomonori

[permalink] [raw]
Subject: [PATCH 11/15] remove is_buffer_dma_capable()

is_buffer_dma_capable() was replaced with dma_capable().

is_buffer_dma_capable() tells if a buffer is dma-capable or
not. However, it doesn't take a pointer to struct device so it doesn't
work for POWERPC.

Signed-off-by: FUJITA Tomonori <[email protected]>
---
include/linux/dma-mapping.h | 5 -----
1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 07dfd46..c0f6c3c 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -98,11 +98,6 @@ static inline int is_device_dma_capable(struct device *dev)
return dev->dma_mask != NULL && *dev->dma_mask != DMA_MASK_NONE;
}

-static inline int is_buffer_dma_capable(u64 mask, dma_addr_t addr, size_t size)
-{
- return addr + size <= mask;
-}
-
#ifdef CONFIG_HAS_DMA
#include <asm/dma-mapping.h>
#else
--
1.6.0.6

2009-07-10 01:10:54

by FUJITA Tomonori

[permalink] [raw]
Subject: [PATCH 13/15] swiotlb: use phys_to_dma and dma_to_phys

This converts swiotlb to use phys_to_dma and dma_to_phys instead of
swiotlb_phys_to_bus() and swiotlb_bus_to_phys().

swiotlb_phys_to_bus() and swiotlb_bus_to_phys() are not necessary so
this patch also removes them.

Signed-off-by: FUJITA Tomonori <[email protected]>
---
include/linux/swiotlb.h | 5 -----
lib/swiotlb.c | 22 ++++++----------------
2 files changed, 6 insertions(+), 21 deletions(-)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index a977da2..73b1f1c 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -23,11 +23,6 @@ struct scatterlist;
extern void
swiotlb_init(void);

-extern dma_addr_t swiotlb_phys_to_bus(struct device *hwdev,
- phys_addr_t address);
-extern phys_addr_t swiotlb_bus_to_phys(struct device *hwdev,
- dma_addr_t address);
-
extern void
*swiotlb_alloc_coherent(struct device *hwdev, size_t size,
dma_addr_t *dma_handle, gfp_t flags);
diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index a0faeb0..de71dba 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -114,20 +114,10 @@ setup_io_tlb_npages(char *str)
__setup("swiotlb=", setup_io_tlb_npages);
/* make io_tlb_overflow tunable too? */

-dma_addr_t __weak swiotlb_phys_to_bus(struct device *hwdev, phys_addr_t paddr)
-{
- return paddr;
-}
-
-phys_addr_t __weak swiotlb_bus_to_phys(struct device *hwdev, dma_addr_t baddr)
-{
- return baddr;
-}
-
static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev,
volatile void *address)
{
- return swiotlb_phys_to_bus(hwdev, virt_to_phys(address));
+ return phys_to_dma(hwdev, virt_to_phys(address));
}

static void swiotlb_print_info(unsigned long bytes)
@@ -566,7 +556,7 @@ void
swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
dma_addr_t dev_addr)
{
- phys_addr_t paddr = swiotlb_bus_to_phys(hwdev, dev_addr);
+ phys_addr_t paddr = dma_to_phys(hwdev, dev_addr);

WARN_ON(irqs_disabled());
if (!is_swiotlb_buffer(paddr))
@@ -611,7 +601,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
struct dma_attrs *attrs)
{
phys_addr_t phys = page_to_phys(page) + offset;
- dma_addr_t dev_addr = swiotlb_phys_to_bus(dev, phys);
+ dma_addr_t dev_addr = phys_to_dma(dev, phys);
void *map;

BUG_ON(dir == DMA_NONE);
@@ -655,7 +645,7 @@ EXPORT_SYMBOL_GPL(swiotlb_map_page);
static void unmap_single(struct device *hwdev, dma_addr_t dev_addr,
size_t size, int dir)
{
- phys_addr_t paddr = swiotlb_bus_to_phys(hwdev, dev_addr);
+ phys_addr_t paddr = dma_to_phys(hwdev, dev_addr);

BUG_ON(dir == DMA_NONE);

@@ -692,7 +682,7 @@ static void
swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr,
size_t size, int dir, int target)
{
- phys_addr_t paddr = swiotlb_bus_to_phys(hwdev, dev_addr);
+ phys_addr_t paddr = dma_to_phys(hwdev, dev_addr);

BUG_ON(dir == DMA_NONE);

@@ -781,7 +771,7 @@ swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems,

for_each_sg(sgl, sg, nelems, i) {
phys_addr_t paddr = sg_phys(sg);
- dma_addr_t dev_addr = swiotlb_phys_to_bus(hwdev, paddr);
+ dma_addr_t dev_addr = phys_to_dma(hwdev, paddr);

if (swiotlb_force ||
!dma_capable(hwdev, dev_addr, sg->length)) {
--
1.6.0.6

2009-07-10 01:11:12

by FUJITA Tomonori

[permalink] [raw]
Subject: [PATCH 15/15] x86: remove unused swiotlb_phys_to_bus() and swiotlb_bus_to_phys()

phys_to_dma() and dma_to_phys() are used instead of
swiotlb_phys_to_bus() and swiotlb_bus_to_phys().

Signed-off-by: FUJITA Tomonori <[email protected]>
---
arch/x86/kernel/pci-swiotlb.c | 10 ----------
1 files changed, 0 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index 165bd7f..e8a3501 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -13,16 +13,6 @@

int swiotlb __read_mostly;

-dma_addr_t swiotlb_phys_to_bus(struct device *hwdev, phys_addr_t paddr)
-{
- return paddr;
-}
-
-phys_addr_t swiotlb_bus_to_phys(struct device *hwdev, dma_addr_t baddr)
-{
- return baddr;
-}
-
static void *x86_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
dma_addr_t *dma_handle, gfp_t flags)
{
--
1.6.0.6

2009-07-10 01:11:49

by FUJITA Tomonori

[permalink] [raw]
Subject: [PATCH 07/15] ia64: add dma_capable() to replace is_buffer_dma_capable()

dma_capable() eventually replaces is_buffer_dma_capable(), which tells
if a memory area is dma-capable or not. The problem of
is_buffer_dma_capable() is that it doesn't take a pointer to struct
device so it doesn't work for POWERPC.

Signed-off-by: FUJITA Tomonori <[email protected]>
---
arch/ia64/include/asm/dma-mapping.h | 8 ++++++++
1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/arch/ia64/include/asm/dma-mapping.h b/arch/ia64/include/asm/dma-mapping.h
index 5a61b5c..88d0f86 100644
--- a/arch/ia64/include/asm/dma-mapping.h
+++ b/arch/ia64/include/asm/dma-mapping.h
@@ -69,6 +69,14 @@ dma_set_mask (struct device *dev, u64 mask)
return 0;
}

+static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t size)
+{
+ if (!dev->dma_mask)
+ return 0;
+
+ return addr + size <= *dev->dma_mask;
+}
+
extern int dma_get_cache_alignment(void);

static inline void
--
1.6.0.6

2009-07-10 01:11:26

by FUJITA Tomonori

[permalink] [raw]
Subject: [PATCH 14/15] powerpc: remove unused swiotlb_phys_to_bus() and swiotlb_bus_to_phys()

phys_to_dma() and dma_to_phys() are used instead of
swiotlb_phys_to_bus() and swiotlb_bus_to_phys().

Signed-off-by: FUJITA Tomonori <[email protected]>
---
arch/powerpc/kernel/dma-swiotlb.c | 11 -----------
1 files changed, 0 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/kernel/dma-swiotlb.c b/arch/powerpc/kernel/dma-swiotlb.c
index a3bbe02..e8a57de 100644
--- a/arch/powerpc/kernel/dma-swiotlb.c
+++ b/arch/powerpc/kernel/dma-swiotlb.c
@@ -24,17 +24,6 @@
int swiotlb __read_mostly;
unsigned int ppc_swiotlb_enable;

-dma_addr_t swiotlb_phys_to_bus(struct device *hwdev, phys_addr_t paddr)
-{
- return paddr + get_dma_direct_offset(hwdev);
-}
-
-phys_addr_t swiotlb_bus_to_phys(struct device *hwdev, dma_addr_t baddr)
-
-{
- return baddr - get_dma_direct_offset(hwdev);
-}
-
/*
* Determine if an address is reachable by a pci device, or if we must bounce.
*/
--
1.6.0.6

2009-07-10 01:11:38

by FUJITA Tomonori

[permalink] [raw]
Subject: [PATCH 08/15] powerpc: add dma_capable() to replace is_buffer_dma_capable()

dma_capable() eventually replaces is_buffer_dma_capable(), which tells
if a memory area is dma-capable or not. The problem of
is_buffer_dma_capable() is that it doesn't take a pointer to struct
device so it doesn't work for POWERPC.

Signed-off-by: FUJITA Tomonori <[email protected]>
---
arch/powerpc/include/asm/dma-mapping.h | 13 +++++++++++++
1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h
index b44aaab..6ff1f85 100644
--- a/arch/powerpc/include/asm/dma-mapping.h
+++ b/arch/powerpc/include/asm/dma-mapping.h
@@ -424,6 +424,19 @@ static inline int dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
#endif
}

+static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t size)
+{
+ struct dma_mapping_ops *ops = get_dma_ops(dev);
+
+ if (ops->addr_needs_map && ops->addr_needs_map(dev, addr, size))
+ return 0;
+
+ if (!dev->dma_mask)
+ return 0;
+
+ return addr + size <= *dev->dma_mask;
+}
+
#define dma_alloc_noncoherent(d, s, h, f) dma_alloc_coherent(d, s, h, f)
#define dma_free_noncoherent(d, s, v, h) dma_free_coherent(d, s, v, h)
#ifdef CONFIG_NOT_COHERENT_CACHE
--
1.6.0.6

2009-07-10 01:12:24

by FUJITA Tomonori

[permalink] [raw]
Subject: [PATCH 02/15] swiotlb: remove unused swiotlb_alloc()

Nobody uses swiotlb_alloc().

Signed-off-by: FUJITA Tomonori <[email protected]>
---
arch/x86/kernel/pci-swiotlb.c | 5 -----
include/linux/swiotlb.h | 2 --
lib/swiotlb.c | 8 ++------
3 files changed, 2 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index 0ac7cd5..ea675cf 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -13,11 +13,6 @@

int swiotlb __read_mostly;

-void *swiotlb_alloc(unsigned order, unsigned long nslabs)
-{
- return (void *)__get_free_pages(GFP_DMA | __GFP_NOWARN, order);
-}
-
dma_addr_t swiotlb_phys_to_bus(struct device *hwdev, phys_addr_t paddr)
{
return paddr;
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 94db704..6bc5094 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -23,8 +23,6 @@ struct scatterlist;
extern void
swiotlb_init(void);

-extern void *swiotlb_alloc(unsigned order, unsigned long nslabs);
-
extern dma_addr_t swiotlb_phys_to_bus(struct device *hwdev,
phys_addr_t address);
extern phys_addr_t swiotlb_bus_to_phys(struct device *hwdev,
diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 9edfdd4..3c4c21c 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -114,11 +114,6 @@ setup_io_tlb_npages(char *str)
__setup("swiotlb=", setup_io_tlb_npages);
/* make io_tlb_overflow tunable too? */

-void * __weak swiotlb_alloc(unsigned order, unsigned long nslabs)
-{
- return (void *)__get_free_pages(GFP_DMA | __GFP_NOWARN, order);
-}
-
dma_addr_t __weak swiotlb_phys_to_bus(struct device *hwdev, phys_addr_t paddr)
{
return paddr;
@@ -240,7 +235,8 @@ swiotlb_late_init_with_default_size(size_t default_size)
bytes = io_tlb_nslabs << IO_TLB_SHIFT;

while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
- io_tlb_start = swiotlb_alloc(order, io_tlb_nslabs);
+ io_tlb_start = (void *)__get_free_pages(GFP_DMA | __GFP_NOWARN,
+ order);
if (io_tlb_start)
break;
order--;
--
1.6.0.6

2009-07-10 01:12:35

by FUJITA Tomonori

[permalink] [raw]
Subject: [PATCH 09/15] swiotlb: use dma_capable()

This converts swiotlb to use dma_capable() instead of
swiotlb_arch_address_needs_mapping() and is_buffer_dma_capable().

Signed-off-by: FUJITA Tomonori <[email protected]>
---
lib/swiotlb.c | 24 +++++-------------------
1 files changed, 5 insertions(+), 19 deletions(-)

diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 1a89c84..a0faeb0 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -130,12 +130,6 @@ static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev,
return swiotlb_phys_to_bus(hwdev, virt_to_phys(address));
}

-int __weak swiotlb_arch_address_needs_mapping(struct device *hwdev,
- dma_addr_t addr, size_t size)
-{
- return !is_buffer_dma_capable(dma_get_mask(hwdev), addr, size);
-}
-
static void swiotlb_print_info(unsigned long bytes)
{
phys_addr_t pstart, pend;
@@ -296,12 +290,6 @@ cleanup1:
return -ENOMEM;
}

-static inline int
-address_needs_mapping(struct device *hwdev, dma_addr_t addr, size_t size)
-{
- return swiotlb_arch_address_needs_mapping(hwdev, addr, size);
-}
-
static int is_swiotlb_buffer(phys_addr_t paddr)
{
return paddr >= virt_to_phys(io_tlb_start) &&
@@ -538,9 +526,7 @@ swiotlb_alloc_coherent(struct device *hwdev, size_t size,
dma_mask = hwdev->coherent_dma_mask;

ret = (void *)__get_free_pages(flags, order);
- if (ret &&
- !is_buffer_dma_capable(dma_mask, swiotlb_virt_to_bus(hwdev, ret),
- size)) {
+ if (ret && swiotlb_virt_to_bus(hwdev, ret) + size > dma_mask) {
/*
* The allocated memory isn't reachable by the device.
*/
@@ -562,7 +548,7 @@ swiotlb_alloc_coherent(struct device *hwdev, size_t size,
dev_addr = swiotlb_virt_to_bus(hwdev, ret);

/* Confirm address can be DMA'd by device */
- if (!is_buffer_dma_capable(dma_mask, dev_addr, size)) {
+ if (dev_addr + size > dma_mask) {
printk("hwdev DMA mask = 0x%016Lx, dev_addr = 0x%016Lx\n",
(unsigned long long)dma_mask,
(unsigned long long)dev_addr);
@@ -634,7 +620,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
* we can safely return the device addr and not worry about bounce
* buffering it.
*/
- if (!address_needs_mapping(dev, dev_addr, size) && !swiotlb_force)
+ if (dma_capable(dev, dev_addr, size) && !swiotlb_force)
return dev_addr;

/*
@@ -651,7 +637,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
/*
* Ensure that the address returned is DMA'ble
*/
- if (address_needs_mapping(dev, dev_addr, size))
+ if (!dma_capable(dev, dev_addr, size))
panic("map_single: bounce buffer is not DMA'ble");

return dev_addr;
@@ -798,7 +784,7 @@ swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems,
dma_addr_t dev_addr = swiotlb_phys_to_bus(hwdev, paddr);

if (swiotlb_force ||
- address_needs_mapping(hwdev, dev_addr, sg->length)) {
+ !dma_capable(hwdev, dev_addr, sg->length)) {
void *map = map_single(hwdev, sg_phys(sg),
sg->length, dir);
if (!map) {
--
1.6.0.6

2009-07-10 01:12:47

by FUJITA Tomonori

[permalink] [raw]
Subject: [PATCH 04/15] swiotlb: remove unnecessary swiotlb_bus_to_virt

swiotlb_bus_to_virt is unncessary; we can use swiotlb_bus_to_phys and
phys_to_virt instead.

Signed-off-by: FUJITA Tomonori <[email protected]>
---
arch/powerpc/kernel/dma-swiotlb.c | 10 ----------
lib/swiotlb.c | 34 ++++++++++++++++------------------
2 files changed, 16 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/kernel/dma-swiotlb.c b/arch/powerpc/kernel/dma-swiotlb.c
index 68ccf11..41534ae 100644
--- a/arch/powerpc/kernel/dma-swiotlb.c
+++ b/arch/powerpc/kernel/dma-swiotlb.c
@@ -24,16 +24,6 @@
int swiotlb __read_mostly;
unsigned int ppc_swiotlb_enable;

-void *swiotlb_bus_to_virt(struct device *hwdev, dma_addr_t addr)
-{
- unsigned long pfn = PFN_DOWN(swiotlb_bus_to_phys(hwdev, addr));
- void *pageaddr = page_address(pfn_to_page(pfn));
-
- if (pageaddr != NULL)
- return pageaddr + (addr % PAGE_SIZE);
- return NULL;
-}
-
dma_addr_t swiotlb_phys_to_bus(struct device *hwdev, phys_addr_t paddr)
{
return paddr + get_dma_direct_offset(hwdev);
diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index dc1cd1f..1a89c84 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -130,11 +130,6 @@ static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev,
return swiotlb_phys_to_bus(hwdev, virt_to_phys(address));
}

-void * __weak swiotlb_bus_to_virt(struct device *hwdev, dma_addr_t address)
-{
- return phys_to_virt(swiotlb_bus_to_phys(hwdev, address));
-}
-
int __weak swiotlb_arch_address_needs_mapping(struct device *hwdev,
dma_addr_t addr, size_t size)
{
@@ -307,9 +302,10 @@ address_needs_mapping(struct device *hwdev, dma_addr_t addr, size_t size)
return swiotlb_arch_address_needs_mapping(hwdev, addr, size);
}

-static int is_swiotlb_buffer(char *addr)
+static int is_swiotlb_buffer(phys_addr_t paddr)
{
- return addr >= io_tlb_start && addr < io_tlb_end;
+ return paddr >= virt_to_phys(io_tlb_start) &&
+ paddr < virt_to_phys(io_tlb_end);
}

/*
@@ -582,11 +578,13 @@ EXPORT_SYMBOL(swiotlb_alloc_coherent);

void
swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
- dma_addr_t dma_handle)
+ dma_addr_t dev_addr)
{
+ phys_addr_t paddr = swiotlb_bus_to_phys(hwdev, dev_addr);
+
WARN_ON(irqs_disabled());
- if (!is_swiotlb_buffer(vaddr))
- free_pages((unsigned long) vaddr, get_order(size));
+ if (!is_swiotlb_buffer(paddr))
+ free_pages((unsigned long)vaddr, get_order(size));
else
/* DMA_TO_DEVICE to avoid memcpy in unmap_single */
do_unmap_single(hwdev, vaddr, size, DMA_TO_DEVICE);
@@ -671,19 +669,19 @@ EXPORT_SYMBOL_GPL(swiotlb_map_page);
static void unmap_single(struct device *hwdev, dma_addr_t dev_addr,
size_t size, int dir)
{
- char *dma_addr = swiotlb_bus_to_virt(hwdev, dev_addr);
+ phys_addr_t paddr = swiotlb_bus_to_phys(hwdev, dev_addr);

BUG_ON(dir == DMA_NONE);

- if (is_swiotlb_buffer(dma_addr)) {
- do_unmap_single(hwdev, dma_addr, size, dir);
+ if (is_swiotlb_buffer(paddr)) {
+ do_unmap_single(hwdev, phys_to_virt(paddr), size, dir);
return;
}

if (dir != DMA_FROM_DEVICE)
return;

- dma_mark_clean(dma_addr, size);
+ dma_mark_clean(phys_to_virt(paddr), size);
}

void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
@@ -708,19 +706,19 @@ static void
swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr,
size_t size, int dir, int target)
{
- char *dma_addr = swiotlb_bus_to_virt(hwdev, dev_addr);
+ phys_addr_t paddr = swiotlb_bus_to_phys(hwdev, dev_addr);

BUG_ON(dir == DMA_NONE);

- if (is_swiotlb_buffer(dma_addr)) {
- sync_single(hwdev, dma_addr, size, dir, target);
+ if (is_swiotlb_buffer(paddr)) {
+ sync_single(hwdev, phys_to_virt(paddr), size, dir, target);
return;
}

if (dir != DMA_FROM_DEVICE)
return;

- dma_mark_clean(dma_addr, size);
+ dma_mark_clean(phys_to_virt(paddr), size);
}

void
--
1.6.0.6

2009-07-10 01:12:59

by FUJITA Tomonori

[permalink] [raw]
Subject: [PATCH 05/15] x86: add dma_capable() to replace is_buffer_dma_capable()

dma_capable() eventually replaces is_buffer_dma_capable(), which tells
if a memory area is dma-capable or not. The problem of
is_buffer_dma_capable() is that it doesn't take a pointer to struct
device so it doesn't work for POWERPC.

Signed-off-by: FUJITA Tomonori <[email protected]>
---
arch/x86/include/asm/dma-mapping.h | 8 ++++++++
1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/dma-mapping.h b/arch/x86/include/asm/dma-mapping.h
index 1c3f943..adac59c 100644
--- a/arch/x86/include/asm/dma-mapping.h
+++ b/arch/x86/include/asm/dma-mapping.h
@@ -55,6 +55,14 @@ extern int dma_set_mask(struct device *dev, u64 mask);
extern void *dma_generic_alloc_coherent(struct device *dev, size_t size,
dma_addr_t *dma_addr, gfp_t flag);

+static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t size)
+{
+ if (!dev->dma_mask)
+ return 0;
+
+ return addr + size <= *dev->dma_mask;
+}
+
static inline void
dma_cache_sync(struct device *dev, void *vaddr, size_t size,
enum dma_data_direction dir)
--
1.6.0.6

2009-07-10 05:13:19

by Ingo Molnar

[permalink] [raw]
Subject: Re: [00/15] swiotlb cleanup


* FUJITA Tomonori <[email protected]> wrote:

> - removes unused (and unnecessary) hooks in swiotlb.
>
> - adds dma_capable() and converts swiotlb to use it. It can be used to
> know if a memory area is dma capable or not. I added
> is_buffer_dma_capable() for the same purpose long ago but it turned
> out that the function doesn't work on POWERPC.
>
> This can be applied cleanly to linux-next, -mm, and mainline. This
> patchset touches multiple architectures (ia64, powerpc, x86) so I
> guess that -mm is appropriate for this patchset (I don't care much
> what tree would merge this though).
>
> This is tested on x86 but only compile tested on POWERPC and IA64.
>
> Thanks,
>
> =
> arch/ia64/include/asm/dma-mapping.h | 18 ++++++
> arch/powerpc/include/asm/dma-mapping.h | 23 +++++++
> arch/powerpc/kernel/dma-swiotlb.c | 48 +---------------
> arch/x86/include/asm/dma-mapping.h | 18 ++++++
> arch/x86/kernel/pci-dma.c | 2 +-
> arch/x86/kernel/pci-gart_64.c | 5 +-
> arch/x86/kernel/pci-nommu.c | 2 +-
> arch/x86/kernel/pci-swiotlb.c | 25 --------
> include/linux/dma-mapping.h | 5 --
> include/linux/swiotlb.h | 11 ----
> lib/swiotlb.c | 102 +++++++++-----------------------
> 11 files changed, 92 insertions(+), 167 deletions(-)

Hm, the functions and facilities you remove here were added as part
of preparatory patches for Xen guest support. You were aware of
them, you were involved in discussions about those aspects with Ian
and Jeremy but still you chose not to Cc: either of them and you
failed to address that aspect in the changelogs.

I'd like the Xen code to become cleaner more than anyone else here i
guess, but patch submission methods like this are not really
helpful. A far better method is to be open about such disagreements,
to declare them, to Cc: everyone who disagrees, and to line out the
arguments in the changelogs as well - instead of just curtly
declaring those APIs 'unused' and failing to Cc: involved parties.

Alas, on the technical level the cleanups themselves look mostly
fine to me. Ian, Jeremy, the changes will alter Xen's use of
swiotlb, but can the Xen side still live with these new methods - in
particular is dma_capable() sufficient as a mechanism and can the
Xen side filter out DMA allocations to make them physically
continuous?

Ben, Tony, Becky, any objections wrt. the PowerPC / IA64 impact? If
everyone agrees i can apply them to the IOMMU tree, test it and push
it out to -next, etc.

Ingo

2009-07-10 05:36:51

by FUJITA Tomonori

[permalink] [raw]
Subject: Re: [00/15] swiotlb cleanup

On Fri, 10 Jul 2009 07:12:36 +0200
Ingo Molnar <[email protected]> wrote:

>
> * FUJITA Tomonori <[email protected]> wrote:
>
> > - removes unused (and unnecessary) hooks in swiotlb.
> >
> > - adds dma_capable() and converts swiotlb to use it. It can be used to
> > know if a memory area is dma capable or not. I added
> > is_buffer_dma_capable() for the same purpose long ago but it turned
> > out that the function doesn't work on POWERPC.
> >
> > This can be applied cleanly to linux-next, -mm, and mainline. This
> > patchset touches multiple architectures (ia64, powerpc, x86) so I
> > guess that -mm is appropriate for this patchset (I don't care much
> > what tree would merge this though).
> >
> > This is tested on x86 but only compile tested on POWERPC and IA64.
> >
> > Thanks,
> >
> > =
> > arch/ia64/include/asm/dma-mapping.h | 18 ++++++
> > arch/powerpc/include/asm/dma-mapping.h | 23 +++++++
> > arch/powerpc/kernel/dma-swiotlb.c | 48 +---------------
> > arch/x86/include/asm/dma-mapping.h | 18 ++++++
> > arch/x86/kernel/pci-dma.c | 2 +-
> > arch/x86/kernel/pci-gart_64.c | 5 +-
> > arch/x86/kernel/pci-nommu.c | 2 +-
> > arch/x86/kernel/pci-swiotlb.c | 25 --------
> > include/linux/dma-mapping.h | 5 --
> > include/linux/swiotlb.h | 11 ----
> > lib/swiotlb.c | 102 +++++++++-----------------------
> > 11 files changed, 92 insertions(+), 167 deletions(-)
>
> Hm, the functions and facilities you remove here were added as part
> of preparatory patches for Xen guest support. You were aware of
> them, you were involved in discussions about those aspects with Ian
> and Jeremy but still you chose not to Cc: either of them and you
> failed to address that aspect in the changelogs.
>
> I'd like the Xen code to become cleaner more than anyone else here i
> guess, but patch submission methods like this are not really
> helpful. A far better method is to be open about such disagreements,
> to declare them, to Cc: everyone who disagrees, and to line out the
> arguments in the changelogs as well - instead of just curtly
> declaring those APIs 'unused' and failing to Cc: involved parties.
>
> Alas, on the technical level the cleanups themselves look mostly
> fine to me. Ian, Jeremy, the changes will alter Xen's use of
> swiotlb, but can the Xen side still live with these new methods - in
> particular is dma_capable() sufficient as a mechanism and can the
> Xen side filter out DMA allocations to make them physically
> continuous?

dma_capable() doesn't work for Xen in the way that Ian hopes.

As I said to him again and again, he tries to use arch code in the
very original way, and it's unacceptable (of course, he disagreed with
it).

I don't think that we need to take account of dom0 support; we don't
have a clear idea about an acceptable dom0 design (it needs to use
swiotlb code? I don't know yet), we don't even know we will have dom0
support in mainline. That's why I didn't CC this patchset to Xen
camp.

I think that it's more reasonable to think about how the code can
works for dom0 support when Xen people come with the new dom0 code.


> Ben, Tony, Becky, any objections wrt. the PowerPC / IA64 impact? If
> everyone agrees i can apply them to the IOMMU tree, test it and push
> it out to -next, etc.
>
> Ingo
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2009-07-10 14:02:08

by Ian Campbell

[permalink] [raw]
Subject: Re: [00/15] swiotlb cleanup

On Fri, 2009-07-10 at 06:12 +0100, Ingo Molnar wrote:
> Hm, the functions and facilities you remove here were added as part
> of preparatory patches for Xen guest support. You were aware of
> them, you were involved in discussions about those aspects with Ian
> and Jeremy but still you chose not to Cc: either of them and you
> failed to address that aspect in the changelogs.

Thanks for adding me Ingo.

> Alas, on the technical level the cleanups themselves look mostly
> fine to me. Ian, Jeremy, the changes will alter Xen's use of
> swiotlb, but can the Xen side still live with these new methods - in
> particular is dma_capable() sufficient as a mechanism and can the
> Xen side filter out DMA allocations to make them physically
> continuous?

I've not examined the series in detail it looks OK but I don't think it
is quite sufficient. The Xen determination of whether a buffer is
dma_capable or not is based on the physical address while dma_capable
takes only the dma address.

I'm not sure we can "invert" our conditions to work back from dma
address to physical since given a start dma address and a length we
would need to check that dma_to_phys(dma+PAGE_SIZE) ==
dma_to_phys(dma)+PAGE_SIZE etc. However dma+PAGE_SIZE might belong to a
different domain so translating it to a physical address in isolation
tells us nothing especially useful since it would give us the physical
address in that other guest which is useless to us. If we could pass
both physical and dma address to dma_capable I think that would probably
be sufficient for our purposes.

As well as that Xen needs some way to influence the allocation of the
actual bounce buffer itself since we need to arrange for it to be
machine address contiguous as well as physical address contiguous. This
series explicitly removes those hooks without replacement. My most
recent proposal was to have a new swiotlb_init variant which was given a
preallocated buffer which this series doesn't necessarily preclude.

The phys_to_dma and dma_to_phys translation points are the last piece
Xen needs and seem to be preserved in this series.

However Fujita's objection to all of the previous swiotlb-for-xen
proposals was around the addition of the Xen hooks in whichever
location. Originally these hooks were via __weak functions and later
proposals implemented them via function pointer hooks in the x86
implementations of the arch-abstract interfaces (phys<->dma and
dma_capable etc). I don't think this series addresses those objections
(fair enough -- it wasn't intended to) or leads to any new approach to
solving the issue, although I also don't think it makes the issue any
harder to address. I don't think it will be possible to make progress on
Xen usage of swiotlb until a solution can be found to this conflict of
opinion.

Fujita suggested that we export the core sync_single() functionality and
reimplemented the surrounding infrastructure in terms of that (and
incorporating our additional requirements). I prototyped this (it is
currently unworking, in fact it seems to have developed rather a taste
for filesystems :-() but the diffstat of my WIP patch is:
arch/x86/kernel/pci-swiotlb.c | 6
arch/x86/xen/pci-swiotlb.c | 2
drivers/pci/xen-iommu.c | 385 ++++++++++++++++++++++++++++++++++++++++--
include/linux/swiotlb.h | 12 +
lib/swiotlb.c | 10 -
5 files changed, 385 insertions(+), 30 deletions(-)
where a fair number of the lines in xen-iommu.c are copies of functions
from swiotlb.c with minor modifications. As I say it doesn't work yet
but I think it's roughly indicative of what such an approach would look
like. I don't like it much but am happy to run with it if it looks to be
the most acceptable approach. To be honest at the moment I've
deliberately been taking some time away from this stuff to try and gain
a bit of perspective so I haven't looked at it in a while.

Ian.

2009-07-10 14:02:24

by Ian Campbell

[permalink] [raw]
Subject: Re: [00/15] swiotlb cleanup

On Fri, 2009-07-10 at 14:35 +0900, FUJITA Tomonori wrote:
> I don't think that we need to take account of dom0 support; we don't
> have a clear idea about an acceptable dom0 design (it needs to use
> swiotlb code? I don't know yet), we don't even know we will have dom0
> support in mainline. That's why I didn't CC this patchset to Xen
> camp.

The core domain 0 patches which were the subject of the discussions a
few week back are completely orthogonal to the swiotlb side of things
and whatever form they eventually take I do not think it will have any
impact on the shape of the solution which we arrive at for swiotlb. I
don't think that assuming that domain 0 can never be done in a way which
everyone finds acceptable and therefore discounting all consideration of
it is a useful way to make progress with these issues.

The DMA use case is much more tightly tied to the paravirtualized MMU
(which is already in the kernel for domU purposes) than it is to "the
domain 0" patches anyway. Although domain 0 is probably the main use
case, at least today, swiotlb support is also used in a Xen domU as part
of the support for direct assignment of PCI devices to paravirtualised
guests (pci passthrough).

The pci frontend driver depends on some bits of the domain 0 physical
interrupt patches as well as swiotlb which is why I/we haven't tried to
upstream that particular series yet.

Ian.

2009-07-10 14:13:30

by Ingo Molnar

[permalink] [raw]
Subject: Re: [00/15] swiotlb cleanup


* Ian Campbell <[email protected]> wrote:

> I've not examined the series in detail it looks OK but I don't
> think it is quite sufficient. The Xen determination of whether a
> buffer is dma_capable or not is based on the physical address
> while dma_capable takes only the dma address.
>
> I'm not sure we can "invert" our conditions to work back from dma
> address to physical since given a start dma address and a length
> we would need to check that dma_to_phys(dma+PAGE_SIZE) ==
> dma_to_phys(dma)+PAGE_SIZE etc. However dma+PAGE_SIZE might belong
> to a different domain so translating it to a physical address in
> isolation tells us nothing especially useful since it would give
> us the physical address in that other guest which is useless to
> us. If we could pass both physical and dma address to dma_capable
> I think that would probably be sufficient for our purposes.
>
> As well as that Xen needs some way to influence the allocation of
> the actual bounce buffer itself since we need to arrange for it to
> be machine address contiguous as well as physical address
> contiguous. This series explicitly removes those hooks without
> replacement. My most recent proposal was to have a new
> swiotlb_init variant which was given a preallocated buffer which
> this series doesn't necessarily preclude.
>
> The phys_to_dma and dma_to_phys translation points are the last
> piece Xen needs and seem to be preserved in this series.
>
> However Fujita's objection to all of the previous swiotlb-for-xen
> proposals was around the addition of the Xen hooks in whichever
> location. Originally these hooks were via __weak functions and
> later proposals implemented them via function pointer hooks in the
> x86 implementations of the arch-abstract interfaces (phys<->dma
> and dma_capable etc). I don't think this series addresses those
> objections (fair enough -- it wasn't intended to) or leads to any
> new approach to solving the issue, although I also don't think it
> makes the issue any harder to address. I don't think it will be
> possible to make progress on Xen usage of swiotlb until a solution
> can be found to this conflict of opinion.
>
> Fujita suggested that we export the core sync_single()
> functionality and reimplemented the surrounding infrastructure in
> terms of that (and incorporating our additional requirements). I
> prototyped this (it is currently unworking, in fact it seems to
> have developed rather a taste for filesystems :-() but the
> diffstat of my WIP patch is:
>
> arch/x86/kernel/pci-swiotlb.c | 6
> arch/x86/xen/pci-swiotlb.c | 2
> drivers/pci/xen-iommu.c | 385 ++++++++++++++++++++++++++++++++++++++++--
> include/linux/swiotlb.h | 12 +
> lib/swiotlb.c | 10 -
> 5 files changed, 385 insertions(+), 30 deletions(-)
>
> where a fair number of the lines in xen-iommu.c are copies of
> functions from swiotlb.c with minor modifications. As I say it
> doesn't work yet but I think it's roughly indicative of what such
> an approach would look like. I don't like it much but am happy to
> run with it if it looks to be the most acceptable approach. [...]

+400 lines of code to avoid much fewer lines of generic code impact
on the lib/swiotlb.c side sounds like a bad technical choice to me.

It makes the swiotlb code less useful and basically forks a random
implementation of it in drivers/pci/xen-iommu.c.

Fujita-san, can you think of a solution that avoids the whole-sale
copying of hundreds of lines of code?

Ingo

2009-07-13 04:22:16

by FUJITA Tomonori

[permalink] [raw]
Subject: Re: [00/15] swiotlb cleanup

On Fri, 10 Jul 2009 15:02:00 +0100
Ian Campbell <[email protected]> wrote:

> On Fri, 2009-07-10 at 14:35 +0900, FUJITA Tomonori wrote:
> > I don't think that we need to take account of dom0 support; we don't
> > have a clear idea about an acceptable dom0 design (it needs to use
> > swiotlb code? I don't know yet), we don't even know we will have dom0
> > support in mainline. That's why I didn't CC this patchset to Xen
> > camp.
>
> The core domain 0 patches which were the subject of the discussions a
> few week back are completely orthogonal to the swiotlb side of things

? If we don't merge dom0 patch, we don't need dom0 changes to
swiotlb. We don't know we would have dom0 support in mainline. Or I
overlooked something?


> and whatever form they eventually take I do not think it will have any
> impact on the shape of the solution which we arrive at for swiotlb. I
> don't think that assuming that domain 0 can never be done in a way which
> everyone finds acceptable and therefore discounting all consideration of
> it is a useful way to make progress with these issues.
>
> The DMA use case is much more tightly tied to the paravirtualized MMU
> (which is already in the kernel for domU purposes) than it is to "the
> domain 0" patches anyway. Although domain 0 is probably the main use
> case, at least today, swiotlb support is also used in a Xen domU as part
> of the support for direct assignment of PCI devices to paravirtualised
> guests (pci passthrough).
>
> The pci frontend driver depends on some bits of the domain 0 physical
> interrupt patches as well as swiotlb which is why I/we haven't tried to
> upstream that particular series yet.

As far as I know, you have not posted anything about changes to
swiotlb for domU. I can't discuss it. If you want, please send
patches.

2009-07-13 04:22:01

by FUJITA Tomonori

[permalink] [raw]
Subject: Re: [00/15] swiotlb cleanup

On Fri, 10 Jul 2009 16:12:48 +0200
Ingo Molnar <[email protected]> wrote:

> > functionality and reimplemented the surrounding infrastructure in
> > terms of that (and incorporating our additional requirements). I
> > prototyped this (it is currently unworking, in fact it seems to
> > have developed rather a taste for filesystems :-() but the
> > diffstat of my WIP patch is:
> >
> > arch/x86/kernel/pci-swiotlb.c | 6
> > arch/x86/xen/pci-swiotlb.c | 2
> > drivers/pci/xen-iommu.c | 385 ++++++++++++++++++++++++++++++++++++++++--
> > include/linux/swiotlb.h | 12 +
> > lib/swiotlb.c | 10 -
> > 5 files changed, 385 insertions(+), 30 deletions(-)
> >
> > where a fair number of the lines in xen-iommu.c are copies of
> > functions from swiotlb.c with minor modifications. As I say it
> > doesn't work yet but I think it's roughly indicative of what such
> > an approach would look like. I don't like it much but am happy to
> > run with it if it looks to be the most acceptable approach. [...]
>
> +400 lines of code to avoid much fewer lines of generic code impact
> on the lib/swiotlb.c side sounds like a bad technical choice to me.

The amount of code is not the point. The way to impact on the
lib/swiotlb.c is totally wrong from the perspective of the kernel
design; it uses architecture code in the very original (xen) way.


> It makes the swiotlb code less useful and basically forks a random
> implementation of it in drivers/pci/xen-iommu.c.

I don't think so. We always have the reasonable amount of duplication
rather than integration in a dirty way (at least about IOMMU
code).


> Fujita-san, can you think of a solution that avoids the whole-sale
> copying of hundreds of lines of code?

Ian didn't give a pointer to his code in a public place so I'm not
sure his claim is valid. But if his code does the right thing and
the duplication is less than 400 lines, it doesn't sound that bad to
me.

2009-07-13 09:20:05

by FUJITA Tomonori

[permalink] [raw]
Subject: Re: [00/15] swiotlb cleanup

On Mon, 13 Jul 2009 13:20:22 +0900
FUJITA Tomonori <[email protected]> wrote:

> On Fri, 10 Jul 2009 16:12:48 +0200
> Ingo Molnar <[email protected]> wrote:
>
> > > functionality and reimplemented the surrounding infrastructure in
> > > terms of that (and incorporating our additional requirements). I
> > > prototyped this (it is currently unworking, in fact it seems to
> > > have developed rather a taste for filesystems :-() but the
> > > diffstat of my WIP patch is:
> > >
> > > arch/x86/kernel/pci-swiotlb.c | 6
> > > arch/x86/xen/pci-swiotlb.c | 2
> > > drivers/pci/xen-iommu.c | 385 ++++++++++++++++++++++++++++++++++++++++--
> > > include/linux/swiotlb.h | 12 +
> > > lib/swiotlb.c | 10 -
> > > 5 files changed, 385 insertions(+), 30 deletions(-)
> > >
> > > where a fair number of the lines in xen-iommu.c are copies of
> > > functions from swiotlb.c with minor modifications. As I say it
> > > doesn't work yet but I think it's roughly indicative of what such
> > > an approach would look like. I don't like it much but am happy to
> > > run with it if it looks to be the most acceptable approach. [...]
> >
> > +400 lines of code to avoid much fewer lines of generic code impact
> > on the lib/swiotlb.c side sounds like a bad technical choice to me.
>
> The amount of code is not the point. The way to impact on the
> lib/swiotlb.c is totally wrong from the perspective of the kernel
> design; it uses architecture code in the very original (xen) way.

btw, '+400 lines of code to avoid much fewer lines of generic code
impact on the lib/swiotlb.c' doesn't sound true to me.

Here is a patch in the way that Xen people want to do:

http://patchwork.kernel.org/patch/26343/

---
arch/x86/Kconfig | 4 +
arch/x86/include/asm/io.h | 2 +
arch/x86/include/asm/pci_x86.h | 1 +
arch/x86/include/asm/xen/iommu.h | 12 ++
arch/x86/kernel/pci-dma.c | 3 +
arch/x86/pci/Makefile | 1 +
arch/x86/pci/init.c | 6 +
arch/x86/pci/xen.c | 51 +++++++
drivers/pci/Makefile | 2 +
drivers/pci/xen-iommu.c | 271 ++++++++++++++++++++++++++++++++++++++


Even with the way that Xen people want to do, drivers/pci/xen-iommu.c
is about 300 lines. And my patchset removes the nice amount of lines
for dom0 support. I don't see much difference wrt lines.

2009-07-13 09:40:34

by Ian Campbell

[permalink] [raw]
Subject: Re: [00/15] swiotlb cleanup

On Mon, 2009-07-13 at 05:20 +0100, FUJITA Tomonori wrote:
> On Fri, 10 Jul 2009 15:02:00 +0100
> Ian Campbell <[email protected]> wrote:
>
> > On Fri, 2009-07-10 at 14:35 +0900, FUJITA Tomonori wrote:
> > > I don't think that we need to take account of dom0 support; we don't
> > > have a clear idea about an acceptable dom0 design (it needs to use
> > > swiotlb code? I don't know yet), we don't even know we will have dom0
> > > support in mainline. That's why I didn't CC this patchset to Xen
> > > camp.
> >
> > The core domain 0 patches which were the subject of the discussions a
> > few week back are completely orthogonal to the swiotlb side of things
>
> ? If we don't merge dom0 patch, we don't need dom0 changes to
> swiotlb. We don't know we would have dom0 support in mainline. Or I
> overlooked something?
[...]
> As far as I know, you have not posted anything about changes to
> swiotlb for domU. I can't discuss it. If you want, please send
> patches.

There are no separate domU swiotlb patches -- the exact the same patches
as we have already been discussing are useful and necessary for both
domU and dom0.

Ian.

2009-07-13 09:54:48

by FUJITA Tomonori

[permalink] [raw]
Subject: Re: [00/15] swiotlb cleanup

On Mon, 13 Jul 2009 10:40:29 +0100
Ian Campbell <[email protected]> wrote:

> On Mon, 2009-07-13 at 05:20 +0100, FUJITA Tomonori wrote:
> > On Fri, 10 Jul 2009 15:02:00 +0100
> > Ian Campbell <[email protected]> wrote:
> >
> > > On Fri, 2009-07-10 at 14:35 +0900, FUJITA Tomonori wrote:
> > > > I don't think that we need to take account of dom0 support; we don't
> > > > have a clear idea about an acceptable dom0 design (it needs to use
> > > > swiotlb code? I don't know yet), we don't even know we will have dom0
> > > > support in mainline. That's why I didn't CC this patchset to Xen
> > > > camp.
> > >
> > > The core domain 0 patches which were the subject of the discussions a
> > > few week back are completely orthogonal to the swiotlb side of things
> >
> > ? If we don't merge dom0 patch, we don't need dom0 changes to
> > swiotlb. We don't know we would have dom0 support in mainline. Or I
> > overlooked something?
> [...]
> > As far as I know, you have not posted anything about changes to
> > swiotlb for domU. I can't discuss it. If you want, please send
> > patches.
>
> There are no separate domU swiotlb patches -- the exact the same patches
> as we have already been discussing are useful and necessary for both
> domU and dom0.

Hmm, you guys introduced the swiotlb hooks by saying that it's for
only dom0.

=
http://lkml.org/lkml/2008/12/16/351

Hi Ingo,

Here's some patches to clean up swiotlb to prepare for some Xen dom0
patches. These have been posted before, but undergone a round of
cleanups to address various comments.

=

I don't see any comments like 'this is useful to dom0 too'. I'm still
not sure what exactly part is useful to domU.

2009-07-13 10:06:03

by Ian Campbell

[permalink] [raw]
Subject: Re: [00/15] swiotlb cleanup

On Mon, 2009-07-13 at 18:53 +0900, FUJITA Tomonori wrote:
> On Mon, 13 Jul 2009 10:40:29 +0100
> Ian Campbell <[email protected]> wrote:
>
> > On Mon, 2009-07-13 at 05:20 +0100, FUJITA Tomonori wrote:
> > > On Fri, 10 Jul 2009 15:02:00 +0100
> > > Ian Campbell <[email protected]> wrote:
> > >
> > > > On Fri, 2009-07-10 at 14:35 +0900, FUJITA Tomonori wrote:
> > > > > I don't think that we need to take account of dom0 support; we don't
> > > > > have a clear idea about an acceptable dom0 design (it needs to use
> > > > > swiotlb code? I don't know yet), we don't even know we will have dom0
> > > > > support in mainline. That's why I didn't CC this patchset to Xen
> > > > > camp.
> > > >
> > > > The core domain 0 patches which were the subject of the discussions a
> > > > few week back are completely orthogonal to the swiotlb side of things
> > >
> > > ? If we don't merge dom0 patch, we don't need dom0 changes to
> > > swiotlb. We don't know we would have dom0 support in mainline. Or I
> > > overlooked something?
> > [...]
> > > As far as I know, you have not posted anything about changes to
> > > swiotlb for domU. I can't discuss it. If you want, please send
> > > patches.
> >
> > There are no separate domU swiotlb patches -- the exact the same patches
> > as we have already been discussing are useful and necessary for both
> > domU and dom0.
>
> Hmm, you guys introduced the swiotlb hooks by saying that it's for
> only dom0.

That was just sloppy wording on our part. domain 0 is the major usecase
today so there is a tendency to think in those terms but the changes are
actually relevant to any domain with access to a physical device that
can do DMA, this includes domU via PCI passthrough.

> I don't see any comments like 'this is useful to dom0 too'. I'm still
^U?
> not sure what exactly part is useful to domU.

All of it...

Ian.

2009-07-14 02:23:56

by Becky Bruce

[permalink] [raw]
Subject: Re: [PATCH 04/15] swiotlb: remove unnecessary swiotlb_bus_to_virt


On Jul 9, 2009, at 8:04 PM, FUJITA Tomonori wrote:

> swiotlb_bus_to_virt is unncessary; we can use swiotlb_bus_to_phys and
> phys_to_virt instead.

phys_to_virt (also, virt_to_phys) is invalid for highmem addresses on
ppc. In most of the uses in this file, it doesn't matter, as the
iotlb buffers themselves are alloc'd out of lowmem, but the
dma_mark_clean() calls could happen on a highmem addr. Currently, on
powerpc, dma_mark_clean() doesn't do anything, so it isn't a
functional problem. I'm fine with the bulk of this patch, because in
reality, if I need to do something with a virtual address of a highmem
page, I have to get a kmap for the page. So the existing
swiotlb_bus_to_virt isn't really helping.

What is dma_mark_clean used for? Could it be made to take a paddr,
and let the implementation deal with making sure there's a valid vaddr
for the actual "clean" operation?

I'd also like to see a comment with swiotlb_virt_to_bus() that notes
that it should only be called on addresses in lowmem.

Cheers,
Becky

>
>
> Signed-off-by: FUJITA Tomonori <[email protected]>
> ---
> arch/powerpc/kernel/dma-swiotlb.c | 10 ----------
> lib/swiotlb.c | 34 +++++++++++++++
> +------------------
> 2 files changed, 16 insertions(+), 28 deletions(-)
>
> diff --git a/arch/powerpc/kernel/dma-swiotlb.c b/arch/powerpc/kernel/
> dma-swiotlb.c
> index 68ccf11..41534ae 100644
> --- a/arch/powerpc/kernel/dma-swiotlb.c
> +++ b/arch/powerpc/kernel/dma-swiotlb.c
> @@ -24,16 +24,6 @@
> int swiotlb __read_mostly;
> unsigned int ppc_swiotlb_enable;
>
> -void *swiotlb_bus_to_virt(struct device *hwdev, dma_addr_t addr)
> -{
> - unsigned long pfn = PFN_DOWN(swiotlb_bus_to_phys(hwdev, addr));
> - void *pageaddr = page_address(pfn_to_page(pfn));
> -
> - if (pageaddr != NULL)
> - return pageaddr + (addr % PAGE_SIZE);
> - return NULL;
> -}
> -
> dma_addr_t swiotlb_phys_to_bus(struct device *hwdev, phys_addr_t
> paddr)
> {
> return paddr + get_dma_direct_offset(hwdev);
> diff --git a/lib/swiotlb.c b/lib/swiotlb.c
> index dc1cd1f..1a89c84 100644
> --- a/lib/swiotlb.c
> +++ b/lib/swiotlb.c
> @@ -130,11 +130,6 @@ static dma_addr_t swiotlb_virt_to_bus(struct
> device *hwdev,
> return swiotlb_phys_to_bus(hwdev, virt_to_phys(address));
> }
>
> -void * __weak swiotlb_bus_to_virt(struct device *hwdev, dma_addr_t
> address)
> -{
> - return phys_to_virt(swiotlb_bus_to_phys(hwdev, address));
> -}
> -
> int __weak swiotlb_arch_address_needs_mapping(struct device *hwdev,
> dma_addr_t addr, size_t size)
> {
> @@ -307,9 +302,10 @@ address_needs_mapping(struct device *hwdev,
> dma_addr_t addr, size_t size)
> return swiotlb_arch_address_needs_mapping(hwdev, addr, size);
> }
>
> -static int is_swiotlb_buffer(char *addr)
> +static int is_swiotlb_buffer(phys_addr_t paddr)
> {
> - return addr >= io_tlb_start && addr < io_tlb_end;
> + return paddr >= virt_to_phys(io_tlb_start) &&
> + paddr < virt_to_phys(io_tlb_end);
> }
>
> /*
> @@ -582,11 +578,13 @@ EXPORT_SYMBOL(swiotlb_alloc_coherent);
>
> void
> swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
> - dma_addr_t dma_handle)
> + dma_addr_t dev_addr)
> {
> + phys_addr_t paddr = swiotlb_bus_to_phys(hwdev, dev_addr);
> +
> WARN_ON(irqs_disabled());
> - if (!is_swiotlb_buffer(vaddr))
> - free_pages((unsigned long) vaddr, get_order(size));
> + if (!is_swiotlb_buffer(paddr))
> + free_pages((unsigned long)vaddr, get_order(size));
> else
> /* DMA_TO_DEVICE to avoid memcpy in unmap_single */
> do_unmap_single(hwdev, vaddr, size, DMA_TO_DEVICE);
> @@ -671,19 +669,19 @@ EXPORT_SYMBOL_GPL(swiotlb_map_page);
> static void unmap_single(struct device *hwdev, dma_addr_t dev_addr,
> size_t size, int dir)
> {
> - char *dma_addr = swiotlb_bus_to_virt(hwdev, dev_addr);
> + phys_addr_t paddr = swiotlb_bus_to_phys(hwdev, dev_addr);
>
> BUG_ON(dir == DMA_NONE);
>
> - if (is_swiotlb_buffer(dma_addr)) {
> - do_unmap_single(hwdev, dma_addr, size, dir);
> + if (is_swiotlb_buffer(paddr)) {
> + do_unmap_single(hwdev, phys_to_virt(paddr), size, dir);
> return;
> }
>
> if (dir != DMA_FROM_DEVICE)
> return;
>
> - dma_mark_clean(dma_addr, size);
> + dma_mark_clean(phys_to_virt(paddr), size);
> }
>
> void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
> @@ -708,19 +706,19 @@ static void
> swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr,
> size_t size, int dir, int target)
> {
> - char *dma_addr = swiotlb_bus_to_virt(hwdev, dev_addr);
> + phys_addr_t paddr = swiotlb_bus_to_phys(hwdev, dev_addr);
>
> BUG_ON(dir == DMA_NONE);
>
> - if (is_swiotlb_buffer(dma_addr)) {
> - sync_single(hwdev, dma_addr, size, dir, target);
> + if (is_swiotlb_buffer(paddr)) {
> + sync_single(hwdev, phys_to_virt(paddr), size, dir, target);
> return;
> }
>
> if (dir != DMA_FROM_DEVICE)
> return;
>
> - dma_mark_clean(dma_addr, size);
> + dma_mark_clean(phys_to_virt(paddr), size);

>
> }
>
> void
> --
> 1.6.0.6
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-
> kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2009-07-14 03:15:46

by Becky Bruce

[permalink] [raw]
Subject: Re: [00/15] swiotlb cleanup


On Jul 10, 2009, at 12:12 AM, Ingo Molnar wrote:

>
> * FUJITA Tomonori <[email protected]> wrote:
>
>> - removes unused (and unnecessary) hooks in swiotlb.
>>
>> - adds dma_capable() and converts swiotlb to use it. It can be used
>> to
>> know if a memory area is dma capable or not. I added
>> is_buffer_dma_capable() for the same purpose long ago but it turned
>> out that the function doesn't work on POWERPC.
>>
>> This can be applied cleanly to linux-next, -mm, and mainline. This
>> patchset touches multiple architectures (ia64, powerpc, x86) so I
>> guess that -mm is appropriate for this patchset (I don't care much
>> what tree would merge this though).
>>
>> This is tested on x86 but only compile tested on POWERPC and IA64.
>>
>> Thanks,
>>
>> =
>> arch/ia64/include/asm/dma-mapping.h | 18 ++++++
>> arch/powerpc/include/asm/dma-mapping.h | 23 +++++++
>> arch/powerpc/kernel/dma-swiotlb.c | 48 +---------------
>> arch/x86/include/asm/dma-mapping.h | 18 ++++++
>> arch/x86/kernel/pci-dma.c | 2 +-
>> arch/x86/kernel/pci-gart_64.c | 5 +-
>> arch/x86/kernel/pci-nommu.c | 2 +-
>> arch/x86/kernel/pci-swiotlb.c | 25 --------
>> include/linux/dma-mapping.h | 5 --
>> include/linux/swiotlb.h | 11 ----
>> lib/swiotlb.c | 102 ++++++++
>> +-----------------------
>> 11 files changed, 92 insertions(+), 167 deletions(-)
>
> Hm, the functions and facilities you remove here were added as part
> of preparatory patches for Xen guest support. You were aware of
> them, you were involved in discussions about those aspects with Ian
> and Jeremy but still you chose not to Cc: either of them and you
> failed to address that aspect in the changelogs.
>
> I'd like the Xen code to become cleaner more than anyone else here i
> guess, but patch submission methods like this are not really
> helpful. A far better method is to be open about such disagreements,
> to declare them, to Cc: everyone who disagrees, and to line out the
> arguments in the changelogs as well - instead of just curtly
> declaring those APIs 'unused' and failing to Cc: involved parties.
>
> Alas, on the technical level the cleanups themselves look mostly
> fine to me. Ian, Jeremy, the changes will alter Xen's use of
> swiotlb, but can the Xen side still live with these new methods - in
> particular is dma_capable() sufficient as a mechanism and can the
> Xen side filter out DMA allocations to make them physically
> continuous?
>
> Ben, Tony, Becky, any objections wrt. the PowerPC / IA64 impact? If
> everyone agrees i can apply them to the IOMMU tree, test it and push
> it out to -next, etc.
>

Ingo,

With the exception of the patch I commented on, I think these look OK
from the powerpc point of view. I've successfully booted one of my
test platforms with the entire series applied and will run some more
extensive (i.e. not "Whee! A prompt!") tests tomorrow.

-Becky

2009-07-14 05:09:59

by FUJITA Tomonori

[permalink] [raw]
Subject: Re: [PATCH 04/15] swiotlb: remove unnecessary swiotlb_bus_to_virt

On Mon, 13 Jul 2009 21:17:21 -0500
Becky Bruce <[email protected]> wrote:

>
> On Jul 9, 2009, at 8:04 PM, FUJITA Tomonori wrote:
>
> > swiotlb_bus_to_virt is unncessary; we can use swiotlb_bus_to_phys and
> > phys_to_virt instead.
>
> phys_to_virt (also, virt_to_phys) is invalid for highmem addresses on
> ppc. In most of the uses in this file, it doesn't matter, as the
> iotlb buffers themselves are alloc'd out of lowmem,

Right,

> but the
> dma_mark_clean() calls could happen on a highmem addr. Currently, on
> powerpc, dma_mark_clean() doesn't do anything, so it isn't a
> functional problem.

Oops, I overlooked this. However, as you said, this is not a problem
with the current code.


> I'm fine with the bulk of this patch, because in
> reality, if I need to do something with a virtual address of a highmem
> page, I have to get a kmap for the page. So the existing
> swiotlb_bus_to_virt isn't really helping.
>
> What is dma_mark_clean used for? Could it be made to take a paddr,
> and let the implementation deal with making sure there's a valid vaddr
> for the actual "clean" operation?

I think that it's IA64's optimization (it's a NULL function on X86
like POWERPC). It's defined in arch/ia64/mm/init.c. Looks like POWERPC
could use this optimization, I guess.

dma_mark_clean() just modifies the page flag (what it needs is struct
page) so I think that we can make it take a paddr.


> I'd also like to see a comment with swiotlb_virt_to_bus() that notes
> that it should only be called on addresses in lowmem.

Ok, I'll do.


Thanks,

2009-07-15 20:26:08

by Becky Bruce

[permalink] [raw]
Subject: Re: [00/15] swiotlb cleanup


On Jul 13, 2009, at 10:13 PM, Becky Bruce wrote:

>
> On Jul 10, 2009, at 12:12 AM, Ingo Molnar wrote:
>
>>
>> * FUJITA Tomonori <[email protected]> wrote:
>>
>>> - removes unused (and unnecessary) hooks in swiotlb.
>>>
>>> - adds dma_capable() and converts swiotlb to use it. It can be
>>> used to
>>> know if a memory area is dma capable or not. I added
>>> is_buffer_dma_capable() for the same purpose long ago but it turned
>>> out that the function doesn't work on POWERPC.
>>>
>>> This can be applied cleanly to linux-next, -mm, and mainline. This
>>> patchset touches multiple architectures (ia64, powerpc, x86) so I
>>> guess that -mm is appropriate for this patchset (I don't care much
>>> what tree would merge this though).
>>>
>>> This is tested on x86 but only compile tested on POWERPC and IA64.
>>>
>>> Thanks,
>>>
>>> =
>>> arch/ia64/include/asm/dma-mapping.h | 18 ++++++
>>> arch/powerpc/include/asm/dma-mapping.h | 23 +++++++
>>> arch/powerpc/kernel/dma-swiotlb.c | 48 +---------------
>>> arch/x86/include/asm/dma-mapping.h | 18 ++++++
>>> arch/x86/kernel/pci-dma.c | 2 +-
>>> arch/x86/kernel/pci-gart_64.c | 5 +-
>>> arch/x86/kernel/pci-nommu.c | 2 +-
>>> arch/x86/kernel/pci-swiotlb.c | 25 --------
>>> include/linux/dma-mapping.h | 5 --
>>> include/linux/swiotlb.h | 11 ----
>>> lib/swiotlb.c | 102 ++++++++
>>> +-----------------------
>>> 11 files changed, 92 insertions(+), 167 deletions(-)
>>
>> Hm, the functions and facilities you remove here were added as part
>> of preparatory patches for Xen guest support. You were aware of
>> them, you were involved in discussions about those aspects with Ian
>> and Jeremy but still you chose not to Cc: either of them and you
>> failed to address that aspect in the changelogs.
>>
>> I'd like the Xen code to become cleaner more than anyone else here i
>> guess, but patch submission methods like this are not really
>> helpful. A far better method is to be open about such disagreements,
>> to declare them, to Cc: everyone who disagrees, and to line out the
>> arguments in the changelogs as well - instead of just curtly
>> declaring those APIs 'unused' and failing to Cc: involved parties.
>>
>> Alas, on the technical level the cleanups themselves look mostly
>> fine to me. Ian, Jeremy, the changes will alter Xen's use of
>> swiotlb, but can the Xen side still live with these new methods - in
>> particular is dma_capable() sufficient as a mechanism and can the
>> Xen side filter out DMA allocations to make them physically
>> continuous?
>>
>> Ben, Tony, Becky, any objections wrt. the PowerPC / IA64 impact? If
>> everyone agrees i can apply them to the IOMMU tree, test it and push
>> it out to -next, etc.
>>
>
> Ingo,
>
> With the exception of the patch I commented on, I think these look
> OK from the powerpc point of view. I've successfully booted one of
> my test platforms with the entire series applied and will run some
> more extensive (i.e. not "Whee! A prompt!") tests tomorrow.

Well, I am still testing. I've observed one unexpected LTP testcase
failure with these patches applied, but so far have been unable to
reproduce it. So these patches are probably OK, but I will look into
this some more next week.

-Becky

>
>
> -Becky
>
> _______________________________________________
> Linuxppc-dev mailing list
> [email protected]
> https://lists.ozlabs.org/listinfo/linuxppc-dev

2009-07-16 03:41:34

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: [PATCH 04/15] swiotlb: remove unnecessary swiotlb_bus_to_virt

On Mon, 2009-07-13 at 21:17 -0500, Becky Bruce wrote:
>
> phys_to_virt (also, virt_to_phys) is invalid for highmem addresses on
> ppc.

I would be surprised if x86 was any different here. Most of our highmem
implementation comes straight from x86 :-)

Cheers,
Ben.

2009-07-18 10:42:58

by Ingo Molnar

[permalink] [raw]
Subject: Re: [00/15] swiotlb cleanup


* FUJITA Tomonori <[email protected]> wrote:

> On Mon, 13 Jul 2009 13:20:22 +0900
> FUJITA Tomonori <[email protected]> wrote:
>
> > On Fri, 10 Jul 2009 16:12:48 +0200
> > Ingo Molnar <[email protected]> wrote:
> >
> > > > functionality and reimplemented the surrounding infrastructure in
> > > > terms of that (and incorporating our additional requirements). I
> > > > prototyped this (it is currently unworking, in fact it seems to
> > > > have developed rather a taste for filesystems :-() but the
> > > > diffstat of my WIP patch is:
> > > >
> > > > arch/x86/kernel/pci-swiotlb.c | 6
> > > > arch/x86/xen/pci-swiotlb.c | 2
> > > > drivers/pci/xen-iommu.c | 385 ++++++++++++++++++++++++++++++++++++++++--
> > > > include/linux/swiotlb.h | 12 +
> > > > lib/swiotlb.c | 10 -
> > > > 5 files changed, 385 insertions(+), 30 deletions(-)
> > > >
> > > > where a fair number of the lines in xen-iommu.c are copies of
> > > > functions from swiotlb.c with minor modifications. As I say it
> > > > doesn't work yet but I think it's roughly indicative of what such
> > > > an approach would look like. I don't like it much but am happy to
> > > > run with it if it looks to be the most acceptable approach. [...]
> > >
> > > +400 lines of code to avoid much fewer lines of generic code impact
> > > on the lib/swiotlb.c side sounds like a bad technical choice to me.
> >
> > The amount of code is not the point. The way to impact on the
> > lib/swiotlb.c is totally wrong from the perspective of the kernel
> > design; it uses architecture code in the very original (xen) way.
>
> btw, '+400 lines of code to avoid much fewer lines of generic code
> impact on the lib/swiotlb.c' doesn't sound true to me.
>
> Here is a patch in the way that Xen people want to do:
>
> http://patchwork.kernel.org/patch/26343/
>
> ---
> arch/x86/Kconfig | 4 +
> arch/x86/include/asm/io.h | 2 +
> arch/x86/include/asm/pci_x86.h | 1 +
> arch/x86/include/asm/xen/iommu.h | 12 ++
> arch/x86/kernel/pci-dma.c | 3 +
> arch/x86/pci/Makefile | 1 +
> arch/x86/pci/init.c | 6 +
> arch/x86/pci/xen.c | 51 +++++++
> drivers/pci/Makefile | 2 +
> drivers/pci/xen-iommu.c | 271 ++++++++++++++++++++++++++++++++++++++
>
> Even with the way that Xen people want to do,
> drivers/pci/xen-iommu.c is about 300 lines. And my patchset
> removes the nice amount of lines for dom0 support. I don't see
> much difference wrt lines.

ok, that kind of impact looks reasonable. If we are wrong and the
Xen model becomes duplicated anywhere else it can still be
generalized into core swiotlb code.

Ingo