2014-11-12 11:40:11

by Stefano Stabellini

Subject: [PATCH v9 0/13] introduce GNTTABOP_cache_flush

Hi all,
this patch series introduces support for GNTTABOP_cache_flush to perform
cache maintenance operation on foreign pages and reverts the current
code based on XENFEAT_grant_map_identity.

It also provides a very slow fallback by bouncing on the swiotlb buffer,
in case the hypercall is not available.

v7 introduces a flag named dma_coherent in dev_archdata and an accessor
function named is_device_dma_coherent in asm/dma-mapping.h under arm and
arm64.
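
For context, the overall behaviour the series implements for a dma
map/unmap in dom0 can be sketched roughly as follows (illustrative
pseudo-code only, not taken verbatim from any of the patches):

    if (pfn == mfn)                         /* local page */
            call the native dma_ops;
    else if (is_device_dma_coherent(dev))   /* foreign page, coherent device */
            do nothing, no cache maintenance needed;
    else if (hypercall_cflush)              /* foreign page, non-coherent device */
            issue GNTTABOP_cache_flush via hypercall;
    else
            bounce on the swiotlb buffer;   /* very slow fallback */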


Changes in v9:
- remove BUG_ON from the loop in dma_cache_maint;
- add static inline for xen_dma_unmap_page, xen_dma_sync_single_for_cpu,
xen_dma_sync_single_for_device;
- map_page is always present, don't check whether it's implemented;
- use bool local for clarity;
- add an in-code comment.

Changes in v8:
- remove code duplication in mm32.c by moving the pfn_valid check in
page-coherent.h;
- add a dma_addr_t dev_addr argument to xen_dma_map_page;
- use hypercall to flush caches in map_page;
- pass dev_addr to xen_dma_unmap_page and xen_dma_sync_single_for_cpu;
- remove BUG_ON in xen_bus_to_phys.

Changes in v7:
- rebased on 3.18-rc1;
- rename is_dma_coherent to is_device_dma_coherent and move it to
asm/dma-mapping.h on arm and arm64;
- introduce a dma_coherent flag on arm and arm64;
- set the flags from set_arch_dma_coherent_ops.

Changes in v6:
- rename xen_is_dma_coherent to is_dma_coherent;
- use of_dma_is_coherent to detect the dma coherency of a device;
- remove arch/arm64/include/asm/xen/page-coherent.h, include
../../arm/include/asm/xen/page-coherent.h instead;
- fix indentation.

Changes in v5:
- introduce xen_is_dma_coherent as a xen specific function;
- call xen_is_dma_coherent instead of is_dma_coherent;
- introduce xen_is_dma_coherent to
arch/arm64/include/asm/xen/page-coherent.h;
- do not remove arch/arm64/include/asm/xen/page-coherent.h, add
the missing dma_ops calls from it;
- fix indentation;
- rename hypercall_flush to hypercall_cflush;
- remove spurious change.

Changes in v4:
- remove outer_*_range call;
- introduce is_dma_coherent;
- use is_dma_coherent in arch/arm/xen/mm32.c;
- merge xen/mm32.c into xen/mm.c;
- xen/arm/arm64: introduce xen_arch_need_swiotlb;
- avoid bouncing dma map operations that involve foreign grants and
non-coherent devices if GNTTABOP_cache_flush is provided by Xen.

Changes in v3:
- fix the cache maintenance op call to match what Linux does natively;
- update the hypercall interface to match Xen.

Changes in v2:
- remove the definition of XENFEAT_grant_map_identity;
- update the hypercall interface to match Xen;
- call the interface on a single page at a time.



git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen.git cache_flush_v9

Stefano Stabellini (13):
xen/arm: remove handling of XENFEAT_grant_map_identity
xen/arm: remove outer_*_range call
xen/arm: if(pfn_valid(pfn)) call native dma_ops
arm64: introduce is_device_dma_coherent
arm: introduce is_device_dma_coherent
xen/arm: use is_device_dma_coherent
xen: add a dma_addr_t dev_addr argument to xen_dma_map_page
xen/arm: use hypercall to flush caches in map_page
xen/arm/arm64: merge xen/mm32.c into xen/mm.c
xen/arm/arm64: introduce xen_arch_need_swiotlb
xen/arm: introduce GNTTABOP_cache_flush
swiotlb-xen: pass dev_addr to xen_dma_unmap_page and xen_dma_sync_single_for_cpu
swiotlb-xen: remove BUG_ON in xen_bus_to_phys

arch/arm/include/asm/device.h | 1 +
arch/arm/include/asm/dma-mapping.h | 6 +
arch/arm/include/asm/xen/page-coherent.h | 66 +++++++--
arch/arm/include/asm/xen/page.h | 4 +
arch/arm/xen/Makefile | 2 +-
arch/arm/xen/enlighten.c | 5 -
arch/arm/xen/mm.c | 121 +++++++++++++++++
arch/arm/xen/mm32.c | 202 ----------------------------
arch/arm64/include/asm/device.h | 1 +
arch/arm64/include/asm/dma-mapping.h | 6 +
arch/arm64/include/asm/xen/page-coherent.h | 44 +-----
arch/x86/include/asm/xen/page-coherent.h | 4 +-
arch/x86/include/asm/xen/page.h | 7 +
drivers/xen/swiotlb-xen.c | 19 +--
include/xen/interface/features.h | 3 -
include/xen/interface/grant_table.h | 19 +++
16 files changed, 237 insertions(+), 273 deletions(-)
delete mode 100644 arch/arm/xen/mm32.c


2014-11-12 11:41:17

by Stefano Stabellini

Subject: [PATCH v9 04/13] arm64: introduce is_device_dma_coherent

Introduce a boolean flag and an accessor function to check whether a
device is dma_coherent. Set the flag from set_arch_dma_coherent_ops.

Signed-off-by: Stefano Stabellini <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
---
arch/arm64/include/asm/device.h | 1 +
arch/arm64/include/asm/dma-mapping.h | 6 ++++++
2 files changed, 7 insertions(+)

diff --git a/arch/arm64/include/asm/device.h b/arch/arm64/include/asm/device.h
index cf98b36..243ef25 100644
--- a/arch/arm64/include/asm/device.h
+++ b/arch/arm64/include/asm/device.h
@@ -21,6 +21,7 @@ struct dev_archdata {
#ifdef CONFIG_IOMMU_API
void *iommu; /* private IOMMU data */
#endif
+ bool dma_coherent;
};

struct pdev_archdata {
diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h
index adeae3f..e213dc4 100644
--- a/arch/arm64/include/asm/dma-mapping.h
+++ b/arch/arm64/include/asm/dma-mapping.h
@@ -54,11 +54,17 @@ static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)

static inline int set_arch_dma_coherent_ops(struct device *dev)
{
+ dev->archdata.dma_coherent = true;
set_dma_ops(dev, &coherent_swiotlb_dma_ops);
return 0;
}
#define set_arch_dma_coherent_ops set_arch_dma_coherent_ops

+static inline bool is_device_dma_coherent(struct device *dev)
+{
+ return dev->archdata.dma_coherent;
+}
+
#include <asm-generic/dma-mapping-common.h>

static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
--
1.7.10.4

2014-11-12 11:41:21

by Stefano Stabellini

Subject: [PATCH v9 06/13] xen/arm: use is_device_dma_coherent

Use is_device_dma_coherent to check whether we need to issue cache
maintenance operations rather than checking on the existence of a
particular dma_ops function for the device.

This is correct because coherent devices don't need cache maintenance
operations - arm_coherent_dma_ops does not set the hooks that we
were previously checking for existence.

Signed-off-by: Stefano Stabellini <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Acked-by: Ian Campbell <[email protected]>
---
arch/arm/xen/mm32.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm/xen/mm32.c b/arch/arm/xen/mm32.c
index 5bb8391..3ce9dc1 100644
--- a/arch/arm/xen/mm32.c
+++ b/arch/arm/xen/mm32.c
@@ -48,7 +48,7 @@ void __xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
struct dma_attrs *attrs)

{
- if (!__generic_dma_ops(hwdev)->unmap_page)
+ if (is_device_dma_coherent(hwdev))
return;
if (dma_get_attr(DMA_ATTR_SKIP_CPU_SYNC, attrs))
return;
@@ -59,7 +59,7 @@ void __xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
void __xen_dma_sync_single_for_cpu(struct device *hwdev,
dma_addr_t handle, size_t size, enum dma_data_direction dir)
{
- if (!__generic_dma_ops(hwdev)->sync_single_for_cpu)
+ if (is_device_dma_coherent(hwdev))
return;
__xen_dma_page_dev_to_cpu(hwdev, handle, size, dir);
}
@@ -67,7 +67,7 @@ void __xen_dma_sync_single_for_cpu(struct device *hwdev,
void __xen_dma_sync_single_for_device(struct device *hwdev,
dma_addr_t handle, size_t size, enum dma_data_direction dir)
{
- if (!__generic_dma_ops(hwdev)->sync_single_for_device)
+ if (is_device_dma_coherent(hwdev))
return;
__xen_dma_page_cpu_to_dev(hwdev, handle, size, dir);
}
--
1.7.10.4

2014-11-12 11:41:35

by Stefano Stabellini

Subject: [PATCH v9 08/13] xen/arm: use hypercall to flush caches in map_page

In xen_dma_map_page, if the page is a local page, call the native
map_page dma_ops. If the page is foreign, call __xen_dma_map_page, which
issues any required cache maintenance operations via hypercall.

The reason for doing this is that the native dma_ops map_page could
allocate buffers that need to be freed by the native unmap_page. If the
page is foreign we don't call the native unmap_page dma_ops function, so
those buffers would be leaked.

Suggested-by: Catalin Marinas <[email protected]>
Signed-off-by: Stefano Stabellini <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>

---

Changes in v9:
- map_page is always present;
- use bool local for clarity;
- add a comment.
---
arch/arm/include/asm/xen/page-coherent.h | 13 ++++++++++++-
arch/arm/xen/mm32.c | 12 ++++++++++++
2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/xen/page-coherent.h b/arch/arm/include/asm/xen/page-coherent.h
index a309f42..efd5624 100644
--- a/arch/arm/include/asm/xen/page-coherent.h
+++ b/arch/arm/include/asm/xen/page-coherent.h
@@ -5,6 +5,9 @@
#include <linux/dma-attrs.h>
#include <linux/dma-mapping.h>

+void __xen_dma_map_page(struct device *hwdev, struct page *page,
+ dma_addr_t dev_addr, unsigned long offset, size_t size,
+ enum dma_data_direction dir, struct dma_attrs *attrs);
void __xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
size_t size, enum dma_data_direction dir,
struct dma_attrs *attrs);
@@ -32,7 +35,15 @@ static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
dma_addr_t dev_addr, unsigned long offset, size_t size,
enum dma_data_direction dir, struct dma_attrs *attrs)
{
- __generic_dma_ops(hwdev)->map_page(hwdev, page, offset, size, dir, attrs);
+ bool local = PFN_DOWN(dev_addr) == page_to_pfn(page);
+ /* Dom0 is mapped 1:1, so if pfn == mfn the page is local otherwise
+ * is a foreign page grant-mapped in dom0. If the page is local we
+ * can safely call the native dma_ops function, otherwise we call
+ * the xen specific function. */
+ if (local)
+ __generic_dma_ops(hwdev)->map_page(hwdev, page, offset, size, dir, attrs);
+ else
+ __xen_dma_map_page(hwdev, page, dev_addr, offset, size, dir, attrs);
}

static inline void xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
diff --git a/arch/arm/xen/mm32.c b/arch/arm/xen/mm32.c
index 3ce9dc1..c86919b 100644
--- a/arch/arm/xen/mm32.c
+++ b/arch/arm/xen/mm32.c
@@ -43,6 +43,18 @@ static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t handle,
dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, DMA_MAP);
}

+void __xen_dma_map_page(struct device *hwdev, struct page *page,
+ dma_addr_t dev_addr, unsigned long offset, size_t size,
+ enum dma_data_direction dir, struct dma_attrs *attrs)
+{
+ if (is_device_dma_coherent(hwdev))
+ return;
+ if (dma_get_attr(DMA_ATTR_SKIP_CPU_SYNC, attrs))
+ return;
+
+ __xen_dma_page_cpu_to_dev(hwdev, dev_addr, size, dir);
+}
+
void __xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
size_t size, enum dma_data_direction dir,
struct dma_attrs *attrs)
--
1.7.10.4

2014-11-12 11:41:34

by Stefano Stabellini

Subject: [PATCH v9 10/13] xen/arm/arm64: introduce xen_arch_need_swiotlb

Introduce an arch specific function to find out whether a particular dma
mapping operation needs to bounce on the swiotlb buffer.

On ARM and ARM64, if the page involved is a foreign page and the device
is not coherent, we need to bounce because at unmap time we cannot
execute any required cache maintenance operations (we don't know how to
find the pfn from the mfn).

No change of behaviour for x86.

Signed-off-by: Stefano Stabellini <[email protected]>
Reviewed-by: David Vrabel <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Acked-by: Ian Campbell <[email protected]>
Acked-by: Konrad Rzeszutek Wilk <[email protected]>

---

Changes in v6:
- fix ts.

Changes in v5:
- fix indentation.
---
arch/arm/include/asm/xen/page.h | 4 ++++
arch/arm/xen/mm.c | 7 +++++++
arch/x86/include/asm/xen/page.h | 7 +++++++
drivers/xen/swiotlb-xen.c | 5 ++++-
4 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
index 135c24a..68c739b 100644
--- a/arch/arm/include/asm/xen/page.h
+++ b/arch/arm/include/asm/xen/page.h
@@ -107,4 +107,8 @@ static inline bool set_phys_to_machine(unsigned long pfn, unsigned long mfn)
#define xen_remap(cookie, size) ioremap_cache((cookie), (size))
#define xen_unmap(cookie) iounmap((cookie))

+bool xen_arch_need_swiotlb(struct device *dev,
+ unsigned long pfn,
+ unsigned long mfn);
+
#endif /* _ASM_ARM_XEN_PAGE_H */
diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index ab700e1..28ebf3e 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -100,6 +100,13 @@ void __xen_dma_sync_single_for_device(struct device *hwdev,
__xen_dma_page_cpu_to_dev(hwdev, handle, size, dir);
}

+bool xen_arch_need_swiotlb(struct device *dev,
+ unsigned long pfn,
+ unsigned long mfn)
+{
+ return ((pfn != mfn) && !is_device_dma_coherent(dev));
+}
+
int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
unsigned int address_bits,
dma_addr_t *dma_handle)
diff --git a/arch/x86/include/asm/xen/page.h b/arch/x86/include/asm/xen/page.h
index c949923..f58ef6c 100644
--- a/arch/x86/include/asm/xen/page.h
+++ b/arch/x86/include/asm/xen/page.h
@@ -236,4 +236,11 @@ void make_lowmem_page_readwrite(void *vaddr);
#define xen_remap(cookie, size) ioremap((cookie), (size));
#define xen_unmap(cookie) iounmap((cookie))

+static inline bool xen_arch_need_swiotlb(struct device *dev,
+ unsigned long pfn,
+ unsigned long mfn)
+{
+ return false;
+}
+
#endif /* _ASM_X86_XEN_PAGE_H */
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index ad2c5eb..3725ee4 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -399,7 +399,9 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
* buffering it.
*/
if (dma_capable(dev, dev_addr, size) &&
- !range_straddles_page_boundary(phys, size) && !swiotlb_force) {
+ !range_straddles_page_boundary(phys, size) &&
+ !xen_arch_need_swiotlb(dev, PFN_DOWN(phys), PFN_DOWN(dev_addr)) &&
+ !swiotlb_force) {
/* we are not interested in the dma_addr returned by
* xen_dma_map_page, only in the potential cache flushes executed
* by the function. */
@@ -557,6 +559,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl,
dma_addr_t dev_addr = xen_phys_to_bus(paddr);

if (swiotlb_force ||
+ xen_arch_need_swiotlb(hwdev, PFN_DOWN(paddr), PFN_DOWN(dev_addr)) ||
!dma_capable(hwdev, dev_addr, sg->length) ||
range_straddles_page_boundary(paddr, sg->length)) {
phys_addr_t map = swiotlb_tbl_map_single(hwdev,
--
1.7.10.4

2014-11-12 11:42:18

by Stefano Stabellini

Subject: [PATCH v9 09/13] xen/arm/arm64: merge xen/mm32.c into xen/mm.c

Merge xen/mm32.c into xen/mm.c.
As a consequence the code gets compiled on arm64 too.

Signed-off-by: Stefano Stabellini <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
---
arch/arm/xen/Makefile | 2 +-
arch/arm/xen/mm.c | 84 +++++++++++++++++++++++++
arch/arm/xen/mm32.c | 94 ----------------------------
arch/arm64/include/asm/xen/page-coherent.h | 44 +------------
4 files changed, 86 insertions(+), 138 deletions(-)
delete mode 100644 arch/arm/xen/mm32.c

diff --git a/arch/arm/xen/Makefile b/arch/arm/xen/Makefile
index 1f85bfe..1296952 100644
--- a/arch/arm/xen/Makefile
+++ b/arch/arm/xen/Makefile
@@ -1 +1 @@
-obj-y := enlighten.o hypercall.o grant-table.o p2m.o mm.o mm32.o
+obj-y := enlighten.o hypercall.o grant-table.o p2m.o mm.o
diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index b0e77de..ab700e1 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -1,6 +1,10 @@
+#include <linux/cpu.h>
+#include <linux/dma-mapping.h>
#include <linux/bootmem.h>
#include <linux/gfp.h>
+#include <linux/highmem.h>
#include <linux/export.h>
+#include <linux/of_address.h>
#include <linux/slab.h>
#include <linux/types.h>
#include <linux/dma-mapping.h>
@@ -16,6 +20,86 @@
#include <asm/xen/hypercall.h>
#include <asm/xen/interface.h>

+enum dma_cache_op {
+ DMA_UNMAP,
+ DMA_MAP,
+};
+
+/* functions called by SWIOTLB */
+
+static void dma_cache_maint(dma_addr_t handle, unsigned long offset,
+ size_t size, enum dma_data_direction dir, enum dma_cache_op op)
+{
+ unsigned long pfn;
+ size_t left = size;
+
+ pfn = (handle >> PAGE_SHIFT) + offset / PAGE_SIZE;
+ offset %= PAGE_SIZE;
+
+ do {
+ size_t len = left;
+
+ /* TODO: cache flush */
+
+ offset = 0;
+ pfn++;
+ left -= len;
+ } while (left);
+}
+
+static void __xen_dma_page_dev_to_cpu(struct device *hwdev, dma_addr_t handle,
+ size_t size, enum dma_data_direction dir)
+{
+ dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, DMA_UNMAP);
+}
+
+static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t handle,
+ size_t size, enum dma_data_direction dir)
+{
+ dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, DMA_MAP);
+}
+
+void __xen_dma_map_page(struct device *hwdev, struct page *page,
+ dma_addr_t dev_addr, unsigned long offset, size_t size,
+ enum dma_data_direction dir, struct dma_attrs *attrs)
+{
+ if (is_device_dma_coherent(hwdev))
+ return;
+ if (dma_get_attr(DMA_ATTR_SKIP_CPU_SYNC, attrs))
+ return;
+
+ __xen_dma_page_cpu_to_dev(hwdev, dev_addr, size, dir);
+}
+
+void __xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
+ size_t size, enum dma_data_direction dir,
+ struct dma_attrs *attrs)
+
+{
+ if (is_device_dma_coherent(hwdev))
+ return;
+ if (dma_get_attr(DMA_ATTR_SKIP_CPU_SYNC, attrs))
+ return;
+
+ __xen_dma_page_dev_to_cpu(hwdev, handle, size, dir);
+}
+
+void __xen_dma_sync_single_for_cpu(struct device *hwdev,
+ dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+ if (is_device_dma_coherent(hwdev))
+ return;
+ __xen_dma_page_dev_to_cpu(hwdev, handle, size, dir);
+}
+
+void __xen_dma_sync_single_for_device(struct device *hwdev,
+ dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+ if (is_device_dma_coherent(hwdev))
+ return;
+ __xen_dma_page_cpu_to_dev(hwdev, handle, size, dir);
+}
+
int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
unsigned int address_bits,
dma_addr_t *dma_handle)
diff --git a/arch/arm/xen/mm32.c b/arch/arm/xen/mm32.c
deleted file mode 100644
index c86919b..0000000
--- a/arch/arm/xen/mm32.c
+++ /dev/null
@@ -1,94 +0,0 @@
-#include <linux/cpu.h>
-#include <linux/dma-mapping.h>
-#include <linux/gfp.h>
-#include <linux/highmem.h>
-
-#include <xen/features.h>
-enum dma_cache_op {
- DMA_UNMAP,
- DMA_MAP,
-};
-
-/* functions called by SWIOTLB */
-
-static void dma_cache_maint(dma_addr_t handle, unsigned long offset,
- size_t size, enum dma_data_direction dir, enum dma_cache_op op)
-{
- unsigned long pfn;
- size_t left = size;
-
- pfn = (handle >> PAGE_SHIFT) + offset / PAGE_SIZE;
- offset %= PAGE_SIZE;
-
- do {
- size_t len = left;
-
- /* TODO: cache flush */
-
- offset = 0;
- pfn++;
- left -= len;
- } while (left);
-}
-
-static void __xen_dma_page_dev_to_cpu(struct device *hwdev, dma_addr_t handle,
- size_t size, enum dma_data_direction dir)
-{
- dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, DMA_UNMAP);
-}
-
-static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t handle,
- size_t size, enum dma_data_direction dir)
-{
- dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, DMA_MAP);
-}
-
-void __xen_dma_map_page(struct device *hwdev, struct page *page,
- dma_addr_t dev_addr, unsigned long offset, size_t size,
- enum dma_data_direction dir, struct dma_attrs *attrs)
-{
- if (is_device_dma_coherent(hwdev))
- return;
- if (dma_get_attr(DMA_ATTR_SKIP_CPU_SYNC, attrs))
- return;
-
- __xen_dma_page_cpu_to_dev(hwdev, dev_addr, size, dir);
-}
-
-void __xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
- size_t size, enum dma_data_direction dir,
- struct dma_attrs *attrs)
-
-{
- if (is_device_dma_coherent(hwdev))
- return;
- if (dma_get_attr(DMA_ATTR_SKIP_CPU_SYNC, attrs))
- return;
-
- __xen_dma_page_dev_to_cpu(hwdev, handle, size, dir);
-}
-
-void __xen_dma_sync_single_for_cpu(struct device *hwdev,
- dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
- if (is_device_dma_coherent(hwdev))
- return;
- __xen_dma_page_dev_to_cpu(hwdev, handle, size, dir);
-}
-
-void __xen_dma_sync_single_for_device(struct device *hwdev,
- dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
- if (is_device_dma_coherent(hwdev))
- return;
- __xen_dma_page_cpu_to_dev(hwdev, handle, size, dir);
-}
-
-int __init xen_mm32_init(void)
-{
- if (!xen_initial_domain())
- return 0;
-
- return 0;
-}
-arch_initcall(xen_mm32_init);
diff --git a/arch/arm64/include/asm/xen/page-coherent.h b/arch/arm64/include/asm/xen/page-coherent.h
index d7cd4c2..2052102 100644
--- a/arch/arm64/include/asm/xen/page-coherent.h
+++ b/arch/arm64/include/asm/xen/page-coherent.h
@@ -1,43 +1 @@
-#ifndef _ASM_ARM64_XEN_PAGE_COHERENT_H
-#define _ASM_ARM64_XEN_PAGE_COHERENT_H
-
-#include <asm/page.h>
-#include <linux/dma-attrs.h>
-#include <linux/dma-mapping.h>
-
-static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
- dma_addr_t *dma_handle, gfp_t flags,
- struct dma_attrs *attrs)
-{
- return __generic_dma_ops(hwdev)->alloc(hwdev, size, dma_handle, flags, attrs);
-}
-
-static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
- void *cpu_addr, dma_addr_t dma_handle,
- struct dma_attrs *attrs)
-{
- __generic_dma_ops(hwdev)->free(hwdev, size, cpu_addr, dma_handle, attrs);
-}
-
-static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
- dma_addr_t dev_addr, unsigned long offset, size_t size,
- enum dma_data_direction dir, struct dma_attrs *attrs)
-{
-}
-
-static inline void xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
- size_t size, enum dma_data_direction dir,
- struct dma_attrs *attrs)
-{
-}
-
-static inline void xen_dma_sync_single_for_cpu(struct device *hwdev,
- dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
-}
-
-static inline void xen_dma_sync_single_for_device(struct device *hwdev,
- dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
-}
-#endif /* _ASM_ARM64_XEN_PAGE_COHERENT_H */
+#include <../../arm/include/asm/xen/page-coherent.h>
--
1.7.10.4

2014-11-12 11:41:20

by Stefano Stabellini

Subject: [PATCH v9 01/13] xen/arm: remove handling of XENFEAT_grant_map_identity

The feature has been removed from Xen. Also Linux cannot use it on ARM32
without CONFIG_ARM_LPAE.

Signed-off-by: Stefano Stabellini <[email protected]>
Acked-by: Ian Campbell <[email protected]>
Reviewed-by: David Vrabel <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>

---
Changes in v2:

- remove the definition of XENFEAT_grant_map_identity.
---
arch/arm/xen/enlighten.c | 5 ---
arch/arm/xen/mm32.c | 85 +-------------------------------------
include/xen/interface/features.h | 3 --
3 files changed, 1 insertion(+), 92 deletions(-)

diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 0e15f01..c7ca936 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -261,11 +261,6 @@ static int __init xen_guest_init(void)

xen_setup_features();

- if (!xen_feature(XENFEAT_grant_map_identity)) {
- pr_warn("Please upgrade your Xen.\n"
- "If your platform has any non-coherent DMA devices, they won't work properly.\n");
- }
-
if (xen_feature(XENFEAT_dom0))
xen_start_info->flags |= SIF_INITDOMAIN|SIF_PRIVILEGED;
else
diff --git a/arch/arm/xen/mm32.c b/arch/arm/xen/mm32.c
index 3b99860..a5a93fc 100644
--- a/arch/arm/xen/mm32.c
+++ b/arch/arm/xen/mm32.c
@@ -5,70 +5,6 @@

#include <xen/features.h>

-static DEFINE_PER_CPU(unsigned long, xen_mm32_scratch_virt);
-static DEFINE_PER_CPU(pte_t *, xen_mm32_scratch_ptep);
-
-static int alloc_xen_mm32_scratch_page(int cpu)
-{
- struct page *page;
- unsigned long virt;
- pmd_t *pmdp;
- pte_t *ptep;
-
- if (per_cpu(xen_mm32_scratch_ptep, cpu) != NULL)
- return 0;
-
- page = alloc_page(GFP_KERNEL);
- if (page == NULL) {
- pr_warn("Failed to allocate xen_mm32_scratch_page for cpu %d\n", cpu);
- return -ENOMEM;
- }
-
- virt = (unsigned long)__va(page_to_phys(page));
- pmdp = pmd_offset(pud_offset(pgd_offset_k(virt), virt), virt);
- ptep = pte_offset_kernel(pmdp, virt);
-
- per_cpu(xen_mm32_scratch_virt, cpu) = virt;
- per_cpu(xen_mm32_scratch_ptep, cpu) = ptep;
-
- return 0;
-}
-
-static int xen_mm32_cpu_notify(struct notifier_block *self,
- unsigned long action, void *hcpu)
-{
- int cpu = (long)hcpu;
- switch (action) {
- case CPU_UP_PREPARE:
- if (alloc_xen_mm32_scratch_page(cpu))
- return NOTIFY_BAD;
- break;
- default:
- break;
- }
- return NOTIFY_OK;
-}
-
-static struct notifier_block xen_mm32_cpu_notifier = {
- .notifier_call = xen_mm32_cpu_notify,
-};
-
-static void* xen_mm32_remap_page(dma_addr_t handle)
-{
- unsigned long virt = get_cpu_var(xen_mm32_scratch_virt);
- pte_t *ptep = __get_cpu_var(xen_mm32_scratch_ptep);
-
- *ptep = pfn_pte(handle >> PAGE_SHIFT, PAGE_KERNEL);
- local_flush_tlb_kernel_page(virt);
-
- return (void*)virt;
-}
-
-static void xen_mm32_unmap(void *vaddr)
-{
- put_cpu_var(xen_mm32_scratch_virt);
-}
-

/* functions called by SWIOTLB */

@@ -88,13 +24,7 @@ static void dma_cache_maint(dma_addr_t handle, unsigned long offset,

if (!pfn_valid(pfn))
{
- /* Cannot map the page, we don't know its physical address.
- * Return and hope for the best */
- if (!xen_feature(XENFEAT_grant_map_identity))
- return;
- vaddr = xen_mm32_remap_page(handle) + offset;
- op(vaddr, len, dir);
- xen_mm32_unmap(vaddr - offset);
+ /* TODO: cache flush */
} else {
struct page *page = pfn_to_page(pfn);

@@ -181,22 +111,9 @@ void xen_dma_sync_single_for_device(struct device *hwdev,

int __init xen_mm32_init(void)
{
- int cpu;
-
if (!xen_initial_domain())
return 0;

- register_cpu_notifier(&xen_mm32_cpu_notifier);
- get_online_cpus();
- for_each_online_cpu(cpu) {
- if (alloc_xen_mm32_scratch_page(cpu)) {
- put_online_cpus();
- unregister_cpu_notifier(&xen_mm32_cpu_notifier);
- return -ENOMEM;
- }
- }
- put_online_cpus();
-
return 0;
}
arch_initcall(xen_mm32_init);
diff --git a/include/xen/interface/features.h b/include/xen/interface/features.h
index 14334d0..131a6cc 100644
--- a/include/xen/interface/features.h
+++ b/include/xen/interface/features.h
@@ -53,9 +53,6 @@
/* operation as Dom0 is supported */
#define XENFEAT_dom0 11

-/* Xen also maps grant references at pfn = mfn */
-#define XENFEAT_grant_map_identity 12
-
#define XENFEAT_NR_SUBMAPS 1

#endif /* __XEN_PUBLIC_FEATURES_H__ */
--
1.7.10.4

2014-11-12 11:42:51

by Stefano Stabellini

Subject: [PATCH v9 03/13] xen/arm: if(pfn_valid(pfn)) call native dma_ops

Remove code duplication in mm32.c by calling the native dma_ops if the
page is a local page (not a foreign page). Use a simple pfn_valid(pfn)
check to figure out if the page is local, exploiting the fact that dom0
is mapped 1:1, so pfn_valid always returns false when called on a
foreign mfn.

Suggested-by: Catalin Marinas <[email protected]>
Signed-off-by: Stefano Stabellini <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>

---

Changes in v9:

- remove BUG_ON from the loop;
- add static inline for xen_dma_unmap_page, xen_dma_sync_single_for_cpu,
xen_dma_sync_single_for_device.
---
arch/arm/include/asm/xen/page-coherent.h | 49 +++++++++++++++++++++++++----
arch/arm/xen/mm32.c | 50 +++++++-----------------------
2 files changed, 54 insertions(+), 45 deletions(-)

diff --git a/arch/arm/include/asm/xen/page-coherent.h b/arch/arm/include/asm/xen/page-coherent.h
index e8275ea..9cfd895 100644
--- a/arch/arm/include/asm/xen/page-coherent.h
+++ b/arch/arm/include/asm/xen/page-coherent.h
@@ -5,6 +5,15 @@
#include <linux/dma-attrs.h>
#include <linux/dma-mapping.h>

+void __xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
+ size_t size, enum dma_data_direction dir,
+ struct dma_attrs *attrs);
+void __xen_dma_sync_single_for_cpu(struct device *hwdev,
+ dma_addr_t handle, size_t size, enum dma_data_direction dir);
+
+void __xen_dma_sync_single_for_device(struct device *hwdev,
+ dma_addr_t handle, size_t size, enum dma_data_direction dir);
+
static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
dma_addr_t *dma_handle, gfp_t flags,
struct dma_attrs *attrs)
@@ -26,14 +35,42 @@ static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
__generic_dma_ops(hwdev)->map_page(hwdev, page, offset, size, dir, attrs);
}

-void xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
+static inline void xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
size_t size, enum dma_data_direction dir,
- struct dma_attrs *attrs);
+ struct dma_attrs *attrs)
+{
+ unsigned long pfn = PFN_DOWN(handle);
+ /* Dom0 is mapped 1:1, so calling pfn_valid on a foreign mfn will
+ * always return false. If the page is local we can safely call the
+ * native dma_ops function, otherwise we call the xen specific
+ * function. */
+ if (pfn_valid(pfn)) {
+ if (__generic_dma_ops(hwdev)->unmap_page)
+ __generic_dma_ops(hwdev)->unmap_page(hwdev, handle, size, dir, attrs);
+ } else
+ __xen_dma_unmap_page(hwdev, handle, size, dir, attrs);
+}

-void xen_dma_sync_single_for_cpu(struct device *hwdev,
- dma_addr_t handle, size_t size, enum dma_data_direction dir);
+static inline void xen_dma_sync_single_for_cpu(struct device *hwdev,
+ dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+ unsigned long pfn = PFN_DOWN(handle);
+ if (pfn_valid(pfn)) {
+ if (__generic_dma_ops(hwdev)->sync_single_for_cpu)
+ __generic_dma_ops(hwdev)->sync_single_for_cpu(hwdev, handle, size, dir);
+ } else
+ __xen_dma_sync_single_for_cpu(hwdev, handle, size, dir);
+}

-void xen_dma_sync_single_for_device(struct device *hwdev,
- dma_addr_t handle, size_t size, enum dma_data_direction dir);
+static inline void xen_dma_sync_single_for_device(struct device *hwdev,
+ dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+ unsigned long pfn = PFN_DOWN(handle);
+ if (pfn_valid(pfn)) {
+ if (__generic_dma_ops(hwdev)->sync_single_for_device)
+ __generic_dma_ops(hwdev)->sync_single_for_device(hwdev, handle, size, dir);
+ } else
+ __xen_dma_sync_single_for_device(hwdev, handle, size, dir);
+}

#endif /* _ASM_ARM_XEN_PAGE_COHERENT_H */
diff --git a/arch/arm/xen/mm32.c b/arch/arm/xen/mm32.c
index 6153d61..5bb8391 100644
--- a/arch/arm/xen/mm32.c
+++ b/arch/arm/xen/mm32.c
@@ -4,13 +4,15 @@
#include <linux/highmem.h>

#include <xen/features.h>
-
+enum dma_cache_op {
+ DMA_UNMAP,
+ DMA_MAP,
+};

/* functions called by SWIOTLB */

static void dma_cache_maint(dma_addr_t handle, unsigned long offset,
- size_t size, enum dma_data_direction dir,
- void (*op)(const void *, size_t, int))
+ size_t size, enum dma_data_direction dir, enum dma_cache_op op)
{
unsigned long pfn;
size_t left = size;
@@ -20,34 +22,8 @@ static void dma_cache_maint(dma_addr_t handle, unsigned long offset,

do {
size_t len = left;
- void *vaddr;

- if (!pfn_valid(pfn))
- {
- /* TODO: cache flush */
- } else {
- struct page *page = pfn_to_page(pfn);
-
- if (PageHighMem(page)) {
- if (len + offset > PAGE_SIZE)
- len = PAGE_SIZE - offset;
-
- if (cache_is_vipt_nonaliasing()) {
- vaddr = kmap_atomic(page);
- op(vaddr + offset, len, dir);
- kunmap_atomic(vaddr);
- } else {
- vaddr = kmap_high_get(page);
- if (vaddr) {
- op(vaddr + offset, len, dir);
- kunmap_high(page);
- }
- }
- } else {
- vaddr = page_address(page) + offset;
- op(vaddr, len, dir);
- }
- }
+ /* TODO: cache flush */

offset = 0;
pfn++;
@@ -58,20 +34,16 @@ static void dma_cache_maint(dma_addr_t handle, unsigned long offset,
static void __xen_dma_page_dev_to_cpu(struct device *hwdev, dma_addr_t handle,
size_t size, enum dma_data_direction dir)
{
- /* Cannot use __dma_page_dev_to_cpu because we don't have a
- * struct page for handle */
-
- dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, dmac_unmap_area);
+ dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, DMA_UNMAP);
}

static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t handle,
size_t size, enum dma_data_direction dir)
{
-
- dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, dmac_map_area);
+ dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, DMA_MAP);
}

-void xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
+void __xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
size_t size, enum dma_data_direction dir,
struct dma_attrs *attrs)

@@ -84,7 +56,7 @@ void xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
__xen_dma_page_dev_to_cpu(hwdev, handle, size, dir);
}

-void xen_dma_sync_single_for_cpu(struct device *hwdev,
+void __xen_dma_sync_single_for_cpu(struct device *hwdev,
dma_addr_t handle, size_t size, enum dma_data_direction dir)
{
if (!__generic_dma_ops(hwdev)->sync_single_for_cpu)
@@ -92,7 +64,7 @@ void xen_dma_sync_single_for_cpu(struct device *hwdev,
__xen_dma_page_dev_to_cpu(hwdev, handle, size, dir);
}

-void xen_dma_sync_single_for_device(struct device *hwdev,
+void __xen_dma_sync_single_for_device(struct device *hwdev,
dma_addr_t handle, size_t size, enum dma_data_direction dir)
{
if (!__generic_dma_ops(hwdev)->sync_single_for_device)
--
1.7.10.4

2014-11-12 11:42:50

by Stefano Stabellini

Subject: [PATCH v9 07/13] xen: add a dma_addr_t dev_addr argument to xen_dma_map_page

dev_addr is the machine address of the page.

The new parameter can be used by the ARM and ARM64 implementations of
xen_dma_map_page to find out if the page is a local page (pfn == mfn) or
a foreign page (pfn != mfn).

dev_addr could be retrieved again from the physical address, using
pfn_to_mfn, but it requires accessing an rbtree. Since we already have
the dev_addr in our hands at the call site there is no need to get the
mfn twice.
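
To illustrate the point (hypothetical snippet, not part of the patch):
without the new parameter an ARM implementation would have to recompute
the machine address along the lines of

    dev_addr = (dma_addr_t)pfn_to_mfn(page_to_pfn(page)) << PAGE_SHIFT;

which goes through the rbtree-backed p2m lookup, whereas the callers in
swiotlb-xen already hold dev_addr.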

Signed-off-by: Stefano Stabellini <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
---
arch/arm/include/asm/xen/page-coherent.h | 4 ++--
arch/arm64/include/asm/xen/page-coherent.h | 4 ++--
arch/x86/include/asm/xen/page-coherent.h | 4 ++--
drivers/xen/swiotlb-xen.c | 6 ++++--
4 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/arm/include/asm/xen/page-coherent.h b/arch/arm/include/asm/xen/page-coherent.h
index 9cfd895..a309f42 100644
--- a/arch/arm/include/asm/xen/page-coherent.h
+++ b/arch/arm/include/asm/xen/page-coherent.h
@@ -29,8 +29,8 @@ static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
}

static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
- unsigned long offset, size_t size, enum dma_data_direction dir,
- struct dma_attrs *attrs)
+ dma_addr_t dev_addr, unsigned long offset, size_t size,
+ enum dma_data_direction dir, struct dma_attrs *attrs)
{
__generic_dma_ops(hwdev)->map_page(hwdev, page, offset, size, dir, attrs);
}
diff --git a/arch/arm64/include/asm/xen/page-coherent.h b/arch/arm64/include/asm/xen/page-coherent.h
index dde3fc9..d7cd4c2 100644
--- a/arch/arm64/include/asm/xen/page-coherent.h
+++ b/arch/arm64/include/asm/xen/page-coherent.h
@@ -20,8 +20,8 @@ static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
}

static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
- unsigned long offset, size_t size, enum dma_data_direction dir,
- struct dma_attrs *attrs)
+ dma_addr_t dev_addr, unsigned long offset, size_t size,
+ enum dma_data_direction dir, struct dma_attrs *attrs)
{
}

diff --git a/arch/x86/include/asm/xen/page-coherent.h b/arch/x86/include/asm/xen/page-coherent.h
index 7f02fe4..acd844c 100644
--- a/arch/x86/include/asm/xen/page-coherent.h
+++ b/arch/x86/include/asm/xen/page-coherent.h
@@ -22,8 +22,8 @@ static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
}

static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
- unsigned long offset, size_t size, enum dma_data_direction dir,
- struct dma_attrs *attrs) { }
+ dma_addr_t dev_addr, unsigned long offset, size_t size,
+ enum dma_data_direction dir, struct dma_attrs *attrs) { }

static inline void xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
size_t size, enum dma_data_direction dir,
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index ebd8f21..ad2c5eb 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -403,7 +403,7 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
/* we are not interested in the dma_addr returned by
* xen_dma_map_page, only in the potential cache flushes executed
* by the function. */
- xen_dma_map_page(dev, page, offset, size, dir, attrs);
+ xen_dma_map_page(dev, page, dev_addr, offset, size, dir, attrs);
return dev_addr;
}

@@ -417,7 +417,7 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
return DMA_ERROR_CODE;

xen_dma_map_page(dev, pfn_to_page(map >> PAGE_SHIFT),
- map & ~PAGE_MASK, size, dir, attrs);
+ dev_addr, map & ~PAGE_MASK, size, dir, attrs);
dev_addr = xen_phys_to_bus(map);

/*
@@ -574,6 +574,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl,
return 0;
}
xen_dma_map_page(hwdev, pfn_to_page(map >> PAGE_SHIFT),
+ dev_addr,
map & ~PAGE_MASK,
sg->length,
dir,
@@ -584,6 +585,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl,
* xen_dma_map_page, only in the potential cache flushes executed
* by the function. */
xen_dma_map_page(hwdev, pfn_to_page(paddr >> PAGE_SHIFT),
+ dev_addr,
paddr & ~PAGE_MASK,
sg->length,
dir,
--
1.7.10.4

2014-11-12 11:43:52

by Stefano Stabellini

Subject: [PATCH v9 05/13] arm: introduce is_device_dma_coherent

Introduce a boolean flag and an accessor function to check whether a
device is dma_coherent. Set the flag from set_arch_dma_coherent_ops.

Signed-off-by: Stefano Stabellini <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
CC: [email protected]
---
arch/arm/include/asm/device.h | 1 +
arch/arm/include/asm/dma-mapping.h | 6 ++++++
2 files changed, 7 insertions(+)

diff --git a/arch/arm/include/asm/device.h b/arch/arm/include/asm/device.h
index dc662fc..4111592 100644
--- a/arch/arm/include/asm/device.h
+++ b/arch/arm/include/asm/device.h
@@ -17,6 +17,7 @@ struct dev_archdata {
#ifdef CONFIG_ARM_DMA_USE_IOMMU
struct dma_iommu_mapping *mapping;
#endif
+ bool dma_coherent;
};

struct omap_device;
diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index 85738b2..8c3b616 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -123,11 +123,17 @@ static inline unsigned long dma_max_pfn(struct device *dev)

static inline int set_arch_dma_coherent_ops(struct device *dev)
{
+ dev->archdata.dma_coherent = true;
set_dma_ops(dev, &arm_coherent_dma_ops);
return 0;
}
#define set_arch_dma_coherent_ops(dev) set_arch_dma_coherent_ops(dev)

+static inline bool is_device_dma_coherent(struct device *dev)
+{
+ return dev->archdata.dma_coherent;
+}
+
static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
{
unsigned int offset = paddr & ~PAGE_MASK;
--
1.7.10.4

2014-11-12 11:44:08

by Stefano Stabellini

Subject: [PATCH v9 02/13] xen/arm: remove outer_*_range call

Dom0 is not actually capable of issuing outer_inv_range or
outer_clean_range calls.

Signed-off-by: Stefano Stabellini <[email protected]>
Acked-by: Ian Campbell <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
---
arch/arm/xen/mm32.c | 9 ---------
1 file changed, 9 deletions(-)

diff --git a/arch/arm/xen/mm32.c b/arch/arm/xen/mm32.c
index a5a93fc..6153d61 100644
--- a/arch/arm/xen/mm32.c
+++ b/arch/arm/xen/mm32.c
@@ -61,9 +61,6 @@ static void __xen_dma_page_dev_to_cpu(struct device *hwdev, dma_addr_t handle,
/* Cannot use __dma_page_dev_to_cpu because we don't have a
* struct page for handle */

- if (dir != DMA_TO_DEVICE)
- outer_inv_range(handle, handle + size);
-
dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, dmac_unmap_area);
}

@@ -72,12 +69,6 @@ static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t handle,
{

dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, dmac_map_area);
-
- if (dir == DMA_FROM_DEVICE) {
- outer_inv_range(handle, handle + size);
- } else {
- outer_clean_range(handle, handle + size);
- }
}

void xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
--
1.7.10.4

2014-11-12 11:47:55

by Stefano Stabellini

Subject: [PATCH v9 13/13] swiotlb-xen: remove BUG_ON in xen_bus_to_phys

On x86 truncation cannot occur because config XEN depends on X86_64 ||
(X86_32 && X86_PAE).

On ARM truncation can occur without CONFIG_ARM_LPAE, when the dma
operation involves foreign grants. However in that case the physical
address returned by xen_bus_to_phys is actually invalid (there is no mfn
to pfn tracking for foreign grants on ARM) and it is not used.

Signed-off-by: Stefano Stabellini <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
---
drivers/xen/swiotlb-xen.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 498b654..153cf14 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -96,8 +96,6 @@ static inline phys_addr_t xen_bus_to_phys(dma_addr_t baddr)
dma_addr_t dma = (dma_addr_t)pfn << PAGE_SHIFT;
phys_addr_t paddr = dma;

- BUG_ON(paddr != dma); /* truncation has occurred, should never happen */
-
paddr |= baddr & ~PAGE_MASK;

return paddr;
--
1.7.10.4

2014-11-12 11:48:05

by Stefano Stabellini

Subject: [PATCH v9 11/13] xen/arm: introduce GNTTABOP_cache_flush

Introduce support for new hypercall GNTTABOP_cache_flush.
Use it to perform cache flushing on pages used for dma when necessary.

If GNTTABOP_cache_flush is supported by the hypervisor, we don't need to
bounce dma map operations that involve foreign grants and non-coherent
devices.

Signed-off-by: Stefano Stabellini <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Acked-by: Ian Campbell <[email protected]>

---

Changes in v5:
- rename hypercall_flush to hypercall_cflush;
- remove spurious change.

Changes in v4:
- add comment;
- avoid bouncing dma map operations that involve foreign grants and
non-coherent devices if GNTTABOP_cache_flush is provided by Xen.

Changes in v3:
- fix the cache maintenance op call to match what Linux does natively;
- update the hypercall interface to match Xen.

Changes in v2:
- update the hypercall interface to match Xen;
- call the interface on a single page at a time.
---
arch/arm/xen/mm.c | 34 ++++++++++++++++++++++++++++++++--
include/xen/interface/grant_table.h | 19 +++++++++++++++++++
2 files changed, 51 insertions(+), 2 deletions(-)

diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index 28ebf3e..351b24a 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -12,6 +12,7 @@
#include <linux/swiotlb.h>

#include <xen/xen.h>
+#include <xen/interface/grant_table.h>
#include <xen/interface/memory.h>
#include <xen/swiotlb-xen.h>

@@ -24,12 +25,14 @@ enum dma_cache_op {
DMA_UNMAP,
DMA_MAP,
};
+static bool hypercall_cflush = false;

/* functions called by SWIOTLB */

static void dma_cache_maint(dma_addr_t handle, unsigned long offset,
size_t size, enum dma_data_direction dir, enum dma_cache_op op)
{
+ struct gnttab_cache_flush cflush;
unsigned long pfn;
size_t left = size;

@@ -39,7 +42,26 @@ static void dma_cache_maint(dma_addr_t handle, unsigned long offset,
do {
size_t len = left;

- /* TODO: cache flush */
+ /* buffers in highmem or foreign pages cannot cross page
+ * boundaries */
+ if (len + offset > PAGE_SIZE)
+ len = PAGE_SIZE - offset;
+
+ cflush.op = 0;
+ cflush.a.dev_bus_addr = pfn << PAGE_SHIFT;
+ cflush.offset = offset;
+ cflush.length = len;
+
+ if (op == DMA_UNMAP && dir != DMA_TO_DEVICE)
+ cflush.op = GNTTAB_CACHE_INVAL;
+ if (op == DMA_MAP) {
+ if (dir == DMA_FROM_DEVICE)
+ cflush.op = GNTTAB_CACHE_INVAL;
+ else
+ cflush.op = GNTTAB_CACHE_CLEAN;
+ }
+ if (cflush.op)
+ HYPERVISOR_grant_table_op(GNTTABOP_cache_flush, &cflush, 1);

offset = 0;
pfn++;
@@ -104,7 +126,7 @@ bool xen_arch_need_swiotlb(struct device *dev,
unsigned long pfn,
unsigned long mfn)
{
- return ((pfn != mfn) && !is_device_dma_coherent(dev));
+ return (!hypercall_cflush && (pfn != mfn) && !is_device_dma_coherent(dev));
}

int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
@@ -147,10 +169,18 @@ static struct dma_map_ops xen_swiotlb_dma_ops = {

int __init xen_mm_init(void)
{
+ struct gnttab_cache_flush cflush;
if (!xen_initial_domain())
return 0;
xen_swiotlb_init(1, false);
xen_dma_ops = &xen_swiotlb_dma_ops;
+
+ cflush.op = 0;
+ cflush.a.dev_bus_addr = 0;
+ cflush.offset = 0;
+ cflush.length = 0;
+ if (HYPERVISOR_grant_table_op(GNTTABOP_cache_flush, &cflush, 1) != -ENOSYS)
+ hypercall_cflush = true;
return 0;
}
arch_initcall(xen_mm_init);
diff --git a/include/xen/interface/grant_table.h b/include/xen/interface/grant_table.h
index e40fae9..bcce564 100644
--- a/include/xen/interface/grant_table.h
+++ b/include/xen/interface/grant_table.h
@@ -479,6 +479,25 @@ struct gnttab_get_version {
DEFINE_GUEST_HANDLE_STRUCT(gnttab_get_version);

/*
+ * Issue one or more cache maintenance operations on a portion of a
+ * page granted to the calling domain by a foreign domain.
+ */
+#define GNTTABOP_cache_flush 12
+struct gnttab_cache_flush {
+ union {
+ uint64_t dev_bus_addr;
+ grant_ref_t ref;
+ } a;
+ uint16_t offset; /* offset from start of grant */
+ uint16_t length; /* size within the grant */
+#define GNTTAB_CACHE_CLEAN (1<<0)
+#define GNTTAB_CACHE_INVAL (1<<1)
+#define GNTTAB_CACHE_SOURCE_GREF (1<<31)
+ uint32_t op;
+};
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_cache_flush);
+
+/*
* Bitfield values for update_pin_status.flags.
*/
/* Map the grant entry for access by I/O devices. */
--
1.7.10.4

2014-11-12 11:48:13

by Stefano Stabellini

Subject: [PATCH v9 12/13] swiotlb-xen: pass dev_addr to xen_dma_unmap_page and xen_dma_sync_single_for_cpu

xen_dma_unmap_page and xen_dma_sync_single_for_cpu take a dma_addr_t
handle as argument, not a physical address.

Signed-off-by: Stefano Stabellini <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
---
drivers/xen/swiotlb-xen.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 3725ee4..498b654 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -449,7 +449,7 @@ static void xen_unmap_single(struct device *hwdev, dma_addr_t dev_addr,

BUG_ON(dir == DMA_NONE);

- xen_dma_unmap_page(hwdev, paddr, size, dir, attrs);
+ xen_dma_unmap_page(hwdev, dev_addr, size, dir, attrs);

/* NOTE: We use dev_addr here, not paddr! */
if (is_xen_swiotlb_buffer(dev_addr)) {
@@ -497,14 +497,14 @@ xen_swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr,
BUG_ON(dir == DMA_NONE);

if (target == SYNC_FOR_CPU)
- xen_dma_sync_single_for_cpu(hwdev, paddr, size, dir);
+ xen_dma_sync_single_for_cpu(hwdev, dev_addr, size, dir);

/* NOTE: We use dev_addr here, not paddr! */
if (is_xen_swiotlb_buffer(dev_addr))
swiotlb_tbl_sync_single(hwdev, paddr, size, dir, target);

if (target == SYNC_FOR_DEVICE)
- xen_dma_sync_single_for_cpu(hwdev, paddr, size, dir);
+ xen_dma_sync_single_for_cpu(hwdev, dev_addr, size, dir);

if (dir != DMA_FROM_DEVICE)
return;
--
1.7.10.4

2014-11-12 14:22:26

by Julien Grall

Subject: Re: [Xen-devel] [PATCH v9 0/13] introduce GNTTABOP_cache_flush

Hi,

On 11/12/2014 11:39 AM, Stefano Stabellini wrote:
> Hi all,
> this patch series introduces support for GNTTABOP_cache_flush to perform
> cache maintenance operation on foreign pages and reverts the current
> code based on XENFEAT_grant_map_identity.
>
> It also provides a very slow fallback by bouncing on the swiotlb buffer,
> in case the hypercall is not available.
>
> v7 introduces a flag named dma_coherent in dev_archdata and an accessor
> function named is_device_dma_coherent in asm/dma-mapping.h under arm and
> arm64.

I've tested this series on a non-LPAE kernel and don't see the crash anymore:

Tested-by: Julien Grall <[email protected]>

Regards,

> Changes in v9:
> - remove BUG_ON from the loop in dma_cache_maint;
> - add static inline for xen_dma_unmap_page, xen_dma_sync_single_for_cpu,
> xen_dma_sync_single_for_device;
> - map_page is always present, don't check whether it's implemented;
> - use bool local for clarity;
> - add an in-code comment.
>
> Changes in v8:
> - remove code duplication in mm32.c by moving the pfn_valid check in
> page-coherent.h;
> - add a dma_addr_t dev_addr argument to xen_dma_map_page;
> - use hypercall to flush caches in map_page;
> - pass dev_addr to xen_dma_unmap_page and xen_dma_sync_single_for_cpu;
> - remove BUG_ON in xen_bus_to_phys.
>
> Changes in v7:
> - rebased on 3.18-rc1;
> - rename is_dma_coherent to is_device_dma_coherent and move it to
> asm/dma-mapping.h on arm and arm64;
> - introduce a dma_coherent flag on arm and arm64;
> - set the flags from set_arch_dma_coherent_ops.
>
> Changes in v6:
> - rename xen_is_dma_coherent to is_dma_coherent;
> - use of_dma_is_coherent to detect the dma coherency of a device;
> - remove arch/arm64/include/asm/xen/page-coherent.h, include
> ../../arm/include/asm/xen/page-coherent.h instead;
> - fix indentation.
>
> Changes in v5:
> - introduce xen_is_dma_coherent as a xen specific function;
> - call xen_is_dma_coherent instead of is_dma_coherent;
> - introduce xen_is_dma_coherent to
> arch/arm64/include/asm/xen/page-coherent.h;
> - do not remove arch/arm64/include/asm/xen/page-coherent.h, add
> the missing dma_ops calls from it;
> - fix indentation;
> - rename hypercall_flush to hypercall_cflush;
> - remove spurious change.
>
> Changes in v4:
> - remove outer_*_range call;
> - introduce is_dma_coherent;
> - use is_dma_coherent in arch/arm/xen/mm32.c;
> - merge xen/mm32.c into xen/mm.c;
> - xen/arm/arm64: introduce xen_arch_need_swiotlb;
> - avoid bouncing dma map operations that involve foreign grants and
> non-coherent devices if GNTTABOP_cache_flush is provided by Xen.
>
> Changes in v3:
> - fix the cache maintenance op call to match what Linux does natively;
> - update the hypercall interface to match Xen.
>
> Changes in v2:
> - remove the definition of XENFEAT_grant_map_identity;
> - update the hypercall interface to match Xen;
> - call the interface on a single page at a time.
>
>
>
> git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen.git cache_flush_v9
>
> Stefano Stabellini (13):
> xen/arm: remove handling of XENFEAT_grant_map_identity
> xen/arm: remove outer_*_range call
> xen/arm: if(pfn_valid(pfn)) call native dma_ops
> arm64: introduce is_device_dma_coherent
> arm: introduce is_device_dma_coherent
> xen/arm: use is_device_dma_coherent
> xen: add a dma_addr_t dev_addr argument to xen_dma_map_page
> xen/arm: use hypercall to flush caches in map_page
> xen/arm/arm64: merge xen/mm32.c into xen/mm.c
> xen/arm/arm64: introduce xen_arch_need_swiotlb
> xen/arm: introduce GNTTABOP_cache_flush
> swiotlb-xen: pass dev_addr to xen_dma_unmap_page and xen_dma_sync_single_for_cpu
> swiotlb-xen: remove BUG_ON in xen_bus_to_phys
>
> arch/arm/include/asm/device.h | 1 +
> arch/arm/include/asm/dma-mapping.h | 6 +
> arch/arm/include/asm/xen/page-coherent.h | 66 +++++++--
> arch/arm/include/asm/xen/page.h | 4 +
> arch/arm/xen/Makefile | 2 +-
> arch/arm/xen/enlighten.c | 5 -
> arch/arm/xen/mm.c | 121 +++++++++++++++++
> arch/arm/xen/mm32.c | 202 ----------------------------
> arch/arm64/include/asm/device.h | 1 +
> arch/arm64/include/asm/dma-mapping.h | 6 +
> arch/arm64/include/asm/xen/page-coherent.h | 44 +-----
> arch/x86/include/asm/xen/page-coherent.h | 4 +-
> arch/x86/include/asm/xen/page.h | 7 +
> drivers/xen/swiotlb-xen.c | 19 +--
> include/xen/interface/features.h | 3 -
> include/xen/interface/grant_table.h | 19 +++
> 16 files changed, 237 insertions(+), 273 deletions(-)
> delete mode 100644 arch/arm/xen/mm32.c
>
>


--
Julien Grall

2014-11-14 12:00:13

by Stefano Stabellini

Subject: Re: [PATCH v9 05/13] arm: introduce is_device_dma_coherent

Russell,
this patch needs your feedback.

- Stefano

On Wed, 12 Nov 2014, Stefano Stabellini wrote:
> Introduce a boolean flag and an accessor function to check whether a
> device is dma_coherent. Set the flag from set_arch_dma_coherent_ops.
>
> Signed-off-by: Stefano Stabellini <[email protected]>
> Signed-off-by: Catalin Marinas <[email protected]>
> Reviewed-by: Catalin Marinas <[email protected]>
> CC: [email protected]
> ---
> arch/arm/include/asm/device.h | 1 +
> arch/arm/include/asm/dma-mapping.h | 6 ++++++
> 2 files changed, 7 insertions(+)
>
> diff --git a/arch/arm/include/asm/device.h b/arch/arm/include/asm/device.h
> index dc662fc..4111592 100644
> --- a/arch/arm/include/asm/device.h
> +++ b/arch/arm/include/asm/device.h
> @@ -17,6 +17,7 @@ struct dev_archdata {
> #ifdef CONFIG_ARM_DMA_USE_IOMMU
> struct dma_iommu_mapping *mapping;
> #endif
> + bool dma_coherent;
> };
>
> struct omap_device;
> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> index 85738b2..8c3b616 100644
> --- a/arch/arm/include/asm/dma-mapping.h
> +++ b/arch/arm/include/asm/dma-mapping.h
> @@ -123,11 +123,17 @@ static inline unsigned long dma_max_pfn(struct device *dev)
>
> static inline int set_arch_dma_coherent_ops(struct device *dev)
> {
> + dev->archdata.dma_coherent = true;
> set_dma_ops(dev, &arm_coherent_dma_ops);
> return 0;
> }
> #define set_arch_dma_coherent_ops(dev) set_arch_dma_coherent_ops(dev)
>
> +static inline bool is_device_dma_coherent(struct device *dev)
> +{
> + return dev->archdata.dma_coherent;
> +}
> +
> static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
> {
> unsigned int offset = paddr & ~PAGE_MASK;
> --
> 1.7.10.4
>

2014-11-18 16:49:51

by Stefano Stabellini

Subject: Re: [PATCH v9 05/13] arm: introduce is_device_dma_coherent

ping?

On Fri, 14 Nov 2014, Stefano Stabellini wrote:
> Russell,
> this patch needs your feedback.
>
> - Stefano
>
> On Wed, 12 Nov 2014, Stefano Stabellini wrote:
> > Introduce a boolean flag and an accessor function to check whether a
> > device is dma_coherent. Set the flag from set_arch_dma_coherent_ops.
> >
> > Signed-off-by: Stefano Stabellini <[email protected]>
> > Signed-off-by: Catalin Marinas <[email protected]>
> > Reviewed-by: Catalin Marinas <[email protected]>
> > CC: [email protected]
> > ---
> > arch/arm/include/asm/device.h | 1 +
> > arch/arm/include/asm/dma-mapping.h | 6 ++++++
> > 2 files changed, 7 insertions(+)
> >
> > diff --git a/arch/arm/include/asm/device.h b/arch/arm/include/asm/device.h
> > index dc662fc..4111592 100644
> > --- a/arch/arm/include/asm/device.h
> > +++ b/arch/arm/include/asm/device.h
> > @@ -17,6 +17,7 @@ struct dev_archdata {
> > #ifdef CONFIG_ARM_DMA_USE_IOMMU
> > struct dma_iommu_mapping *mapping;
> > #endif
> > + bool dma_coherent;
> > };
> >
> > struct omap_device;
> > diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> > index 85738b2..8c3b616 100644
> > --- a/arch/arm/include/asm/dma-mapping.h
> > +++ b/arch/arm/include/asm/dma-mapping.h
> > @@ -123,11 +123,17 @@ static inline unsigned long dma_max_pfn(struct device *dev)
> >
> > static inline int set_arch_dma_coherent_ops(struct device *dev)
> > {
> > + dev->archdata.dma_coherent = true;
> > set_dma_ops(dev, &arm_coherent_dma_ops);
> > return 0;
> > }
> > #define set_arch_dma_coherent_ops(dev) set_arch_dma_coherent_ops(dev)
> >
> > +static inline bool is_device_dma_coherent(struct device *dev)
> > +{
> > + return dev->archdata.dma_coherent;
> > +}
> > +
> > static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
> > {
> > unsigned int offset = paddr & ~PAGE_MASK;
> > --
> > 1.7.10.4
> >
>

2014-11-19 21:07:08

by Konrad Rzeszutek Wilk

Subject: Re: [PATCH v9 13/13] swiotlb-xen: remove BUG_ON in xen_bus_to_phys

On Wed, Nov 12, 2014 at 11:40:54AM +0000, Stefano Stabellini wrote:
> On x86 truncation cannot occur because config XEN depends on X86_64 ||
> (X86_32 && X86_PAE).
>
> On ARM truncation can occur without CONFIG_ARM_LPAE, when the dma
> operation involves foreign grants. However in that case the physical
> address returned by xen_bus_to_phys is actually invalid (there is no mfn
> to pfn tracking for foreign grants on ARM) and it is not used.
>
> Signed-off-by: Stefano Stabellini <[email protected]>
> Reviewed-by: Catalin Marinas <[email protected]>

Acked-by: Konrad Rzeszutek Wilk <[email protected]>
> ---
> drivers/xen/swiotlb-xen.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 498b654..153cf14 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -96,8 +96,6 @@ static inline phys_addr_t xen_bus_to_phys(dma_addr_t baddr)
> dma_addr_t dma = (dma_addr_t)pfn << PAGE_SHIFT;
> phys_addr_t paddr = dma;
>
> - BUG_ON(paddr != dma); /* truncation has occurred, should never happen */
> -
> paddr |= baddr & ~PAGE_MASK;
>
> return paddr;
> --
> 1.7.10.4
>

2014-11-19 21:07:56

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: [PATCH v9 12/13] swiotlb-xen: pass dev_addr to xen_dma_unmap_page and xen_dma_sync_single_for_cpu

On Wed, Nov 12, 2014 at 11:40:53AM +0000, Stefano Stabellini wrote:
> xen_dma_unmap_page and xen_dma_sync_single_for_cpu take a dma_addr_t
> handle as argument, not a physical address.

Ouch. Should this also go on stable tree?

>
> Signed-off-by: Stefano Stabellini <[email protected]>
> Reviewed-by: Catalin Marinas <[email protected]>
> ---
> drivers/xen/swiotlb-xen.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 3725ee4..498b654 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -449,7 +449,7 @@ static void xen_unmap_single(struct device *hwdev, dma_addr_t dev_addr,
>
> BUG_ON(dir == DMA_NONE);
>
> - xen_dma_unmap_page(hwdev, paddr, size, dir, attrs);
> + xen_dma_unmap_page(hwdev, dev_addr, size, dir, attrs);
>
> /* NOTE: We use dev_addr here, not paddr! */
> if (is_xen_swiotlb_buffer(dev_addr)) {
> @@ -497,14 +497,14 @@ xen_swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr,
> BUG_ON(dir == DMA_NONE);
>
> if (target == SYNC_FOR_CPU)
> - xen_dma_sync_single_for_cpu(hwdev, paddr, size, dir);
> + xen_dma_sync_single_for_cpu(hwdev, dev_addr, size, dir);
>
> /* NOTE: We use dev_addr here, not paddr! */
> if (is_xen_swiotlb_buffer(dev_addr))
> swiotlb_tbl_sync_single(hwdev, paddr, size, dir, target);
>
> if (target == SYNC_FOR_DEVICE)
> - xen_dma_sync_single_for_cpu(hwdev, paddr, size, dir);
> + xen_dma_sync_single_for_cpu(hwdev, dev_addr, size, dir);
>
> if (dir != DMA_FROM_DEVICE)
> return;
> --
> 1.7.10.4
>

2014-11-20 00:07:56

by Russell King - ARM Linux

[permalink] [raw]
Subject: Re: [PATCH v9 05/13] arm: introduce is_device_dma_coherent

On Tue, Nov 18, 2014 at 04:49:21PM +0000, Stefano Stabellini wrote:
> ping?

Sending something which wants my attention _To:_ me is always a good idea :)

The patch is fine in itself, but I have a niggle about the
is_device_dma_coherent() - provided this is only used in architecture
specific code, that should be fine. It could probably do with a comment
to that effect in an attempt to discourage drivers using it (thereby
becoming less portable to other architectures.)

--
FTTC broadband for 0.8mile line: currently at 9.5Mbps down 400kbps up
according to speedtest.net.

2014-11-20 10:48:48

by Stefano Stabellini

[permalink] [raw]
Subject: Re: [PATCH v9 12/13] swiotlb-xen: pass dev_addr to xen_dma_unmap_page and xen_dma_sync_single_for_cpu

On Thu, 20 Nov 2014, Stefano Stabellini wrote:
> On Wed, 19 Nov 2014, Konrad Rzeszutek Wilk wrote:
> > On Wed, Nov 12, 2014 at 11:40:53AM +0000, Stefano Stabellini wrote:
> > > xen_dma_unmap_page and xen_dma_sync_single_for_cpu take a dma_addr_t
> > > handle as argument, not a physical address.
> >
> > Ouch. Should this also go on stable tree?
>
> Good idea

Also can I take that as an Acked-by for this patch?


There is one last bit of common and x86 changes in this series:
patch #7 adds a parameter to xen_dma_map_page, even though the x86
implementation is empty:

http://marc.info/?l=xen-devel&m=141579253829808&w=2

is that OK for you?
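
To be concrete, the x86 stub would end up looking roughly like this
(sketch from memory, not copied from the patch; the extra argument is
simply accepted and ignored there):

	/* x86 needs no cache maintenance on map, so the new dev_addr
	 * argument is unused */
	static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
		     dma_addr_t dev_addr, unsigned long offset, size_t size,
		     enum dma_data_direction dir, struct dma_attrs *attrs) { }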


> > > Signed-off-by: Stefano Stabellini <[email protected]>
> > > Reviewed-by: Catalin Marinas <[email protected]>
> > > ---
> > > drivers/xen/swiotlb-xen.c | 6 +++---
> > > 1 file changed, 3 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> > > index 3725ee4..498b654 100644
> > > --- a/drivers/xen/swiotlb-xen.c
> > > +++ b/drivers/xen/swiotlb-xen.c
> > > @@ -449,7 +449,7 @@ static void xen_unmap_single(struct device *hwdev, dma_addr_t dev_addr,
> > >
> > > BUG_ON(dir == DMA_NONE);
> > >
> > > - xen_dma_unmap_page(hwdev, paddr, size, dir, attrs);
> > > + xen_dma_unmap_page(hwdev, dev_addr, size, dir, attrs);
> > >
> > > /* NOTE: We use dev_addr here, not paddr! */
> > > if (is_xen_swiotlb_buffer(dev_addr)) {
> > > @@ -497,14 +497,14 @@ xen_swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr,
> > > BUG_ON(dir == DMA_NONE);
> > >
> > > if (target == SYNC_FOR_CPU)
> > > - xen_dma_sync_single_for_cpu(hwdev, paddr, size, dir);
> > > + xen_dma_sync_single_for_cpu(hwdev, dev_addr, size, dir);
> > >
> > > /* NOTE: We use dev_addr here, not paddr! */
> > > if (is_xen_swiotlb_buffer(dev_addr))
> > > swiotlb_tbl_sync_single(hwdev, paddr, size, dir, target);
> > >
> > > if (target == SYNC_FOR_DEVICE)
> > > - xen_dma_sync_single_for_cpu(hwdev, paddr, size, dir);
> > > + xen_dma_sync_single_for_cpu(hwdev, dev_addr, size, dir);
> > >
> > > if (dir != DMA_FROM_DEVICE)
> > > return;
> > > --
> > > 1.7.10.4
> > >
> >
>

2014-11-20 10:49:04

by Stefano Stabellini

[permalink] [raw]
Subject: Re: [PATCH v9 05/13] arm: introduce is_device_dma_coherent

On Thu, 20 Nov 2014, Russell King - ARM Linux wrote:
> On Tue, Nov 18, 2014 at 04:49:21PM +0000, Stefano Stabellini wrote:
> > ping?
>
> Sending something which wants my attention _To:_ me is always a good idea :)

I'll keep that in mind :-)


> The patch is fine in itself, but I have a niggle about the
> is_device_dma_coherent() - provided this is only used in architecture
> specific code, that should be fine. It could probably do with a comment
> to that effect in an attempt to discourage drivers using it (thereby
> becoming less portable to other architectures.)

Can I take this as an Acked-by?
Good point regarding the comment. Maybe I can add:

/* do not use this function in a driver */
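
i.e. something like the following in arch/arm/include/asm/dma-mapping.h
(sketch only, exact wording and placement open to suggestions):

	/* do not use this function in a driver */
	static inline bool is_device_dma_coherent(struct device *dev)
	{
		return dev->archdata.dma_coherent;
	}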

2014-11-20 10:49:59

by Stefano Stabellini

[permalink] [raw]
Subject: Re: [PATCH v9 12/13] swiotlb-xen: pass dev_addr to xen_dma_unmap_page and xen_dma_sync_single_for_cpu

On Wed, 19 Nov 2014, Konrad Rzeszutek Wilk wrote:
> On Wed, Nov 12, 2014 at 11:40:53AM +0000, Stefano Stabellini wrote:
> > xen_dma_unmap_page and xen_dma_sync_single_for_cpu take a dma_addr_t
> > handle as argument, not a physical address.
>
> Ouch. Should this also go on stable tree?

Good idea
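
For the stable submission it is probably worth spelling out that both
helpers take the dma_addr_t handle, not the physical address. Roughly
(prototypes from memory; on arm they are static inlines in
page-coherent.h):

	void xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
				size_t size, enum dma_data_direction dir,
				struct dma_attrs *attrs);
	void xen_dma_sync_single_for_cpu(struct device *hwdev, dma_addr_t handle,
					 size_t size, enum dma_data_direction dir);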


> >
> > Signed-off-by: Stefano Stabellini <[email protected]>
> > Reviewed-by: Catalin Marinas <[email protected]>
> > ---
> > drivers/xen/swiotlb-xen.c | 6 +++---
> > 1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> > index 3725ee4..498b654 100644
> > --- a/drivers/xen/swiotlb-xen.c
> > +++ b/drivers/xen/swiotlb-xen.c
> > @@ -449,7 +449,7 @@ static void xen_unmap_single(struct device *hwdev, dma_addr_t dev_addr,
> >
> > BUG_ON(dir == DMA_NONE);
> >
> > - xen_dma_unmap_page(hwdev, paddr, size, dir, attrs);
> > + xen_dma_unmap_page(hwdev, dev_addr, size, dir, attrs);
> >
> > /* NOTE: We use dev_addr here, not paddr! */
> > if (is_xen_swiotlb_buffer(dev_addr)) {
> > @@ -497,14 +497,14 @@ xen_swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr,
> > BUG_ON(dir == DMA_NONE);
> >
> > if (target == SYNC_FOR_CPU)
> > - xen_dma_sync_single_for_cpu(hwdev, paddr, size, dir);
> > + xen_dma_sync_single_for_cpu(hwdev, dev_addr, size, dir);
> >
> > /* NOTE: We use dev_addr here, not paddr! */
> > if (is_xen_swiotlb_buffer(dev_addr))
> > swiotlb_tbl_sync_single(hwdev, paddr, size, dir, target);
> >
> > if (target == SYNC_FOR_DEVICE)
> > - xen_dma_sync_single_for_cpu(hwdev, paddr, size, dir);
> > + xen_dma_sync_single_for_cpu(hwdev, dev_addr, size, dir);
> >
> > if (dir != DMA_FROM_DEVICE)
> > return;
> > --
> > 1.7.10.4
> >
>

2014-11-20 20:31:01

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: [PATCH v9 12/13] swiotlb-xen: pass dev_addr to xen_dma_unmap_page and xen_dma_sync_single_for_cpu

On Thu, Nov 20, 2014 at 10:47:51AM +0000, Stefano Stabellini wrote:
> On Thu, 20 Nov 2014, Stefano Stabellini wrote:
> > On Wed, 19 Nov 2014, Konrad Rzeszutek Wilk wrote:
> > > On Wed, Nov 12, 2014 at 11:40:53AM +0000, Stefano Stabellini wrote:
> > > > xen_dma_unmap_page and xen_dma_sync_single_for_cpu take a dma_addr_t
> > > > handle as argument, not a physical address.
> > >
> > > Ouch. Should this also go on stable tree?
> >
> > Good idea
>
> Also can I take that as an Acked-by for this patch?

Yes.
>
>
> There is one last bit of common and x86 changes in this series:
> patch #7 adds a parameter to xen_dma_map_page, even though the x86
> implementation is empty:
>
> http://marc.info/?l=xen-devel&m=141579253829808&w=2
>
> is that OK for you?

Yes.
>
>
> > > > Signed-off-by: Stefano Stabellini <[email protected]>
> > > > Reviewed-by: Catalin Marinas <[email protected]>
> > > > ---
> > > > drivers/xen/swiotlb-xen.c | 6 +++---
> > > > 1 file changed, 3 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> > > > index 3725ee4..498b654 100644
> > > > --- a/drivers/xen/swiotlb-xen.c
> > > > +++ b/drivers/xen/swiotlb-xen.c
> > > > @@ -449,7 +449,7 @@ static void xen_unmap_single(struct device *hwdev, dma_addr_t dev_addr,
> > > >
> > > > BUG_ON(dir == DMA_NONE);
> > > >
> > > > - xen_dma_unmap_page(hwdev, paddr, size, dir, attrs);
> > > > + xen_dma_unmap_page(hwdev, dev_addr, size, dir, attrs);
> > > >
> > > > /* NOTE: We use dev_addr here, not paddr! */
> > > > if (is_xen_swiotlb_buffer(dev_addr)) {
> > > > @@ -497,14 +497,14 @@ xen_swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr,
> > > > BUG_ON(dir == DMA_NONE);
> > > >
> > > > if (target == SYNC_FOR_CPU)
> > > > - xen_dma_sync_single_for_cpu(hwdev, paddr, size, dir);
> > > > + xen_dma_sync_single_for_cpu(hwdev, dev_addr, size, dir);
> > > >
> > > > /* NOTE: We use dev_addr here, not paddr! */
> > > > if (is_xen_swiotlb_buffer(dev_addr))
> > > > swiotlb_tbl_sync_single(hwdev, paddr, size, dir, target);
> > > >
> > > > if (target == SYNC_FOR_DEVICE)
> > > > - xen_dma_sync_single_for_cpu(hwdev, paddr, size, dir);
> > > > + xen_dma_sync_single_for_cpu(hwdev, dev_addr, size, dir);
> > > >
> > > > if (dir != DMA_FROM_DEVICE)
> > > > return;
> > > > --
> > > > 1.7.10.4
> > > >
> > >
> >

2014-11-21 15:29:43

by Russell King - ARM Linux

[permalink] [raw]
Subject: Re: [PATCH v9 05/13] arm: introduce is_device_dma_coherent

On Thu, Nov 20, 2014 at 10:39:32AM +0000, Stefano Stabellini wrote:
> On Thu, 20 Nov 2014, Russell King - ARM Linux wrote:
> > On Tue, Nov 18, 2014 at 04:49:21PM +0000, Stefano Stabellini wrote:
> > > ping?
> >
> > Sending something which wants my attention _To:_ me is always a good idea :)
>
> I'll keep that in mind :-)
>
>
> > The patch is fine in itself, but I have a niggle about the
> > is_device_dma_coherent() - provided this is only used in architecture
> > specific code, that should be fine. It could probably do with a comment
> > to that effect in an attempt to discourage drivers using it (thereby
> > becoming less portable to other architectures.)
>
> Can I take this as an Acked-by?
> Good point regarding the comment. Maybe I can add:
>
> /* do not use this function in a driver */

Provided it gets such a comment, yes. Thanks.

--
FTTC broadband for 0.8mile line: currently at 9.5Mbps down 400kbps up
according to speedtest.net.

2014-11-21 16:31:51

by Stefano Stabellini

[permalink] [raw]
Subject: Re: [PATCH v9 10/13] xen/arm/arm64: introduce xen_arch_need_swiotlb

On Wed, 12 Nov 2014, Stefano Stabellini wrote:
> Introduce an arch specific function to find out whether a particular dma
> mapping operation needs to bounce on the swiotlb buffer.
>
> On ARM and ARM64, if the page involved is a foreign page and the device
> is not coherent, we need to bounce because at unmap time we cannot
> execute any required cache maintenance operations (we don't know how to
> find the pfn from the mfn).
>
> No change of behaviour for x86.
>
> Signed-off-by: Stefano Stabellini <[email protected]>
> Reviewed-by: David Vrabel <[email protected]>
> Reviewed-by: Catalin Marinas <[email protected]>
> Acked-by: Ian Campbell <[email protected]>
> Acked-by: Konrad Rzeszutek Wilk <[email protected]>

I am thinking of asking for a backport of this patch to 3.16+

The catch is that is_device_dma_coherent is not available on older
kernels, so I'll have to change the arm implementation of
xen_arch_need_swiotlb to:

+bool xen_arch_need_swiotlb(struct device *dev,
+ unsigned long pfn,
+ unsigned long mfn)
+{
+ return pfn != mfn;
+}
+

It is going to make things slower but it is going to fix the issue with
cache flushing buffers for non-coherent devices.

Konrad, are you OK with that?


> Changes in v6:
> - fix ts.
>
> Changes in v5:
> - fix indentation.
> ---
> arch/arm/include/asm/xen/page.h | 4 ++++
> arch/arm/xen/mm.c | 7 +++++++
> arch/x86/include/asm/xen/page.h | 7 +++++++
> drivers/xen/swiotlb-xen.c | 5 ++++-
> 4 files changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
> index 135c24a..68c739b 100644
> --- a/arch/arm/include/asm/xen/page.h
> +++ b/arch/arm/include/asm/xen/page.h
> @@ -107,4 +107,8 @@ static inline bool set_phys_to_machine(unsigned long pfn, unsigned long mfn)
> #define xen_remap(cookie, size) ioremap_cache((cookie), (size))
> #define xen_unmap(cookie) iounmap((cookie))
>
> +bool xen_arch_need_swiotlb(struct device *dev,
> + unsigned long pfn,
> + unsigned long mfn);
> +
> #endif /* _ASM_ARM_XEN_PAGE_H */
> diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
> index ab700e1..28ebf3e 100644
> --- a/arch/arm/xen/mm.c
> +++ b/arch/arm/xen/mm.c
> @@ -100,6 +100,13 @@ void __xen_dma_sync_single_for_device(struct device *hwdev,
> __xen_dma_page_cpu_to_dev(hwdev, handle, size, dir);
> }
>
> +bool xen_arch_need_swiotlb(struct device *dev,
> + unsigned long pfn,
> + unsigned long mfn)
> +{
> + return ((pfn != mfn) && !is_device_dma_coherent(dev));
> +}
> +
> int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
> unsigned int address_bits,
> dma_addr_t *dma_handle)
> diff --git a/arch/x86/include/asm/xen/page.h b/arch/x86/include/asm/xen/page.h
> index c949923..f58ef6c 100644
> --- a/arch/x86/include/asm/xen/page.h
> +++ b/arch/x86/include/asm/xen/page.h
> @@ -236,4 +236,11 @@ void make_lowmem_page_readwrite(void *vaddr);
> #define xen_remap(cookie, size) ioremap((cookie), (size));
> #define xen_unmap(cookie) iounmap((cookie))
>
> +static inline bool xen_arch_need_swiotlb(struct device *dev,
> + unsigned long pfn,
> + unsigned long mfn)
> +{
> + return false;
> +}
> +
> #endif /* _ASM_X86_XEN_PAGE_H */
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index ad2c5eb..3725ee4 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -399,7 +399,9 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
> * buffering it.
> */
> if (dma_capable(dev, dev_addr, size) &&
> - !range_straddles_page_boundary(phys, size) && !swiotlb_force) {
> + !range_straddles_page_boundary(phys, size) &&
> + !xen_arch_need_swiotlb(dev, PFN_DOWN(phys), PFN_DOWN(dev_addr)) &&
> + !swiotlb_force) {
> /* we are not interested in the dma_addr returned by
> * xen_dma_map_page, only in the potential cache flushes executed
> * by the function. */
> @@ -557,6 +559,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl,
> dma_addr_t dev_addr = xen_phys_to_bus(paddr);
>
> if (swiotlb_force ||
> + xen_arch_need_swiotlb(hwdev, PFN_DOWN(paddr), PFN_DOWN(dev_addr)) ||
> !dma_capable(hwdev, dev_addr, sg->length) ||
> range_straddles_page_boundary(paddr, sg->length)) {
> phys_addr_t map = swiotlb_tbl_map_single(hwdev,
> --
> 1.7.10.4
>

2014-11-21 16:48:52

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: [PATCH v9 10/13] xen/arm/arm64: introduce xen_arch_need_swiotlb

On Fri, Nov 21, 2014 at 04:31:31PM +0000, Stefano Stabellini wrote:
> On Wed, 12 Nov 2014, Stefano Stabellini wrote:
> > Introduce an arch specific function to find out whether a particular dma
> > mapping operation needs to bounce on the swiotlb buffer.
> >
> > On ARM and ARM64, if the page involved is a foreign page and the device
> > is not coherent, we need to bounce because at unmap time we cannot
> > execute any required cache maintenance operations (we don't know how to
> > find the pfn from the mfn).
> >
> > No change of behaviour for x86.
> >
> > Signed-off-by: Stefano Stabellini <[email protected]>
> > Reviewed-by: David Vrabel <[email protected]>
> > Reviewed-by: Catalin Marinas <[email protected]>
> > Acked-by: Ian Campbell <[email protected]>
> > Acked-by: Konrad Rzeszutek Wilk <[email protected]>
>
> I am thinking of asking for a backport of this patch to 3.16+
>
> The catch is that is_device_dma_coherent is not available on older
> kernels, so I'll have to change the arm implementation of
> xen_arch_need_swiotlb to:
>
> +bool xen_arch_need_swiotlb(struct device *dev,
> + unsigned long pfn,
> + unsigned long mfn)
> +{
> + return pfn != mfn;
> +}
> +
>
> It is going to make things slower but it is going to fix the issue with
> cache flushing buffers for non-coherent devices.
>
> Konrad, are you OK with that?

Looks fine.
>
>
> > Changes in v6:
> > - fix ts.
> >
> > Changes in v5:
> > - fix indentation.
> > ---
> > arch/arm/include/asm/xen/page.h | 4 ++++
> > arch/arm/xen/mm.c | 7 +++++++
> > arch/x86/include/asm/xen/page.h | 7 +++++++
> > drivers/xen/swiotlb-xen.c | 5 ++++-
> > 4 files changed, 22 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
> > index 135c24a..68c739b 100644
> > --- a/arch/arm/include/asm/xen/page.h
> > +++ b/arch/arm/include/asm/xen/page.h
> > @@ -107,4 +107,8 @@ static inline bool set_phys_to_machine(unsigned long pfn, unsigned long mfn)
> > #define xen_remap(cookie, size) ioremap_cache((cookie), (size))
> > #define xen_unmap(cookie) iounmap((cookie))
> >
> > +bool xen_arch_need_swiotlb(struct device *dev,
> > + unsigned long pfn,
> > + unsigned long mfn);
> > +
> > #endif /* _ASM_ARM_XEN_PAGE_H */
> > diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
> > index ab700e1..28ebf3e 100644
> > --- a/arch/arm/xen/mm.c
> > +++ b/arch/arm/xen/mm.c
> > @@ -100,6 +100,13 @@ void __xen_dma_sync_single_for_device(struct device *hwdev,
> > __xen_dma_page_cpu_to_dev(hwdev, handle, size, dir);
> > }
> >
> > +bool xen_arch_need_swiotlb(struct device *dev,
> > + unsigned long pfn,
> > + unsigned long mfn)
> > +{
> > + return ((pfn != mfn) && !is_device_dma_coherent(dev));
> > +}
> > +
> > int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
> > unsigned int address_bits,
> > dma_addr_t *dma_handle)
> > diff --git a/arch/x86/include/asm/xen/page.h b/arch/x86/include/asm/xen/page.h
> > index c949923..f58ef6c 100644
> > --- a/arch/x86/include/asm/xen/page.h
> > +++ b/arch/x86/include/asm/xen/page.h
> > @@ -236,4 +236,11 @@ void make_lowmem_page_readwrite(void *vaddr);
> > #define xen_remap(cookie, size) ioremap((cookie), (size));
> > #define xen_unmap(cookie) iounmap((cookie))
> >
> > +static inline bool xen_arch_need_swiotlb(struct device *dev,
> > + unsigned long pfn,
> > + unsigned long mfn)
> > +{
> > + return false;
> > +}
> > +
> > #endif /* _ASM_X86_XEN_PAGE_H */
> > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> > index ad2c5eb..3725ee4 100644
> > --- a/drivers/xen/swiotlb-xen.c
> > +++ b/drivers/xen/swiotlb-xen.c
> > @@ -399,7 +399,9 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
> > * buffering it.
> > */
> > if (dma_capable(dev, dev_addr, size) &&
> > - !range_straddles_page_boundary(phys, size) && !swiotlb_force) {
> > + !range_straddles_page_boundary(phys, size) &&
> > + !xen_arch_need_swiotlb(dev, PFN_DOWN(phys), PFN_DOWN(dev_addr)) &&
> > + !swiotlb_force) {
> > /* we are not interested in the dma_addr returned by
> > * xen_dma_map_page, only in the potential cache flushes executed
> > * by the function. */
> > @@ -557,6 +559,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl,
> > dma_addr_t dev_addr = xen_phys_to_bus(paddr);
> >
> > if (swiotlb_force ||
> > + xen_arch_need_swiotlb(hwdev, PFN_DOWN(paddr), PFN_DOWN(dev_addr)) ||
> > !dma_capable(hwdev, dev_addr, sg->length) ||
> > range_straddles_page_boundary(paddr, sg->length)) {
> > phys_addr_t map = swiotlb_tbl_map_single(hwdev,
> > --
> > 1.7.10.4
> >