2014-10-27 15:11:56

by Stefano Stabellini

Subject: [PATCH v7 0/8] introduce GNTTABOP_cache_flush

Hi all,
this patch series introduces support for GNTTABOP_cache_flush to perform
cache maintenance operations on foreign pages and reverts the current
code based on XENFEAT_grant_map_identity.

It also provides a very slow fallback that bounces on the swiotlb buffer
in case the hypercall is not available.
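
To give an idea of the interface, this is roughly what a single flush
looks like on the map/unmap paths (a condensed sketch of the code added
by patch 8; the structure layout is in the updated grant_table.h):

    struct gnttab_cache_flush cflush;

    cflush.a.dev_bus_addr = pfn << PAGE_SHIFT; /* page-aligned bus address */
    cflush.offset = offset;                    /* offset within the page */
    cflush.length = len;                       /* must not cross the page */
    cflush.op = GNTTAB_CACHE_CLEAN;            /* or GNTTAB_CACHE_INVAL */
    HYPERVISOR_grant_table_op(GNTTABOP_cache_flush, &cflush, 1);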

v7 introduces a flag named dma_coherent in dev_archdata and an accessor
function named is_device_dma_coherent in asm/dma-mapping.h under arm and
arm64.
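
The flag is set once, when the coherent dma_ops are installed for the
device, and checked on the dma paths before doing any cache maintenance.
Condensed from patches 3-5 below:

    /* at device setup time, from set_arch_dma_coherent_ops() */
    dev->archdata.dma_coherent = true;

    /* later, on the dma map/unmap/sync paths */
    if (is_device_dma_coherent(hwdev))
        return; /* coherent device: no cache maintenance needed */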


Changes in v7:
- rebased on 3.18-rc1;
- rename is_dma_coherent to is_device_dma_coherent and move it to
asm/dma-mapping.h on arm and arm64;
- introduce a dma_coherent flag on arm and arm64;
- set the flags from set_arch_dma_coherent_ops.

Changes in v6:
- rename xen_is_dma_coherent to is_dma_coherent;
- use of_dma_is_coherent to detect the dma coherency of a device;
- remove arch/arm64/include/asm/xen/page-coherent.h, include
../../arm/include/asm/xen/page-coherent.h instead;
- fix indentation.

Changes in v5:
- introduce xen_is_dma_coherent as a xen specific function;
- call xen_is_dma_coherent instead of is_dma_coherent;
- introduce xen_is_dma_coherent to
arch/arm64/include/asm/xen/page-coherent.h;
- do not remove arch/arm64/include/asm/xen/page-coherent.h, add
the missing dma_ops calls from it;
- fix indentation;
- rename hypercall_flush to hypercall_cflush;
- remove spurious change.

Changes in v4:
- remove outer_*_range call;
- introduce is_dma_coherent;
- use is_dma_coherent in arch/arm/xen/mm32.c;
- merge xen/mm32.c into xen/mm.c;
- xen/arm/arm64: introduce xen_arch_need_swiotlb;
- avoid bouncing dma map operations that involve foreign grants and
non-coherent devices if GNTTABOP_cache_flush is provided by Xen.

Changes in v3:
- fix the cache maintenance op call to match what Linux does natively;
- update the hypercall interface to match Xen.

Changes in v2:
- remove the definition of XENFEAT_grant_map_identity;
- update the hypercall interface to match Xen;
- call the interface on a single page at a time.


Stefano Stabellini (8):
xen/arm: remove handling of XENFEAT_grant_map_identity
xen/arm: remove outer_*_range call
arm64: introduce is_device_dma_coherent
arm: introduce is_device_dma_coherent
xen/arm: use is_device_dma_coherent
xen/arm/arm64: merge xen/mm32.c into xen/mm.c
xen/arm/arm64: introduce xen_arch_need_swiotlb
xen/arm: introduce GNTTABOP_cache_flush

arch/arm/include/asm/device.h | 1 +
arch/arm/include/asm/dma-mapping.h | 6 +
arch/arm/include/asm/xen/page.h | 4 +
arch/arm/xen/Makefile | 2 +-
arch/arm/xen/enlighten.c | 5 -
arch/arm/xen/mm.c | 159 ++++++++++++++++++++++
arch/arm/xen/mm32.c | 202 ----------------------------
arch/arm64/include/asm/device.h | 1 +
arch/arm64/include/asm/dma-mapping.h | 6 +
arch/arm64/include/asm/xen/page-coherent.h | 44 +-----
arch/x86/include/asm/xen/page.h | 7 +
drivers/xen/swiotlb-xen.c | 5 +-
include/xen/interface/features.h | 3 -
include/xen/interface/grant_table.h | 19 +++
14 files changed, 209 insertions(+), 255 deletions(-)
delete mode 100644 arch/arm/xen/mm32.c


2014-10-27 15:12:15

by Stefano Stabellini

Subject: [PATCH v7 6/8] xen/arm/arm64: merge xen/mm32.c into xen/mm.c

Merge xen/mm32.c into xen/mm.c.
As a consequence the code gets compiled on arm64 too: introduce a few
compat functions so that it can actually build there.

Signed-off-by: Stefano Stabellini <[email protected]>
Acked-by: Ian Campbell <[email protected]>
---
arch/arm/xen/Makefile | 2 +-
arch/arm/xen/mm.c | 123 ++++++++++++++++++++++++++++
arch/arm/xen/mm32.c | 110 -------------------------
arch/arm64/include/asm/xen/page-coherent.h | 44 +---------
4 files changed, 125 insertions(+), 154 deletions(-)
delete mode 100644 arch/arm/xen/mm32.c

diff --git a/arch/arm/xen/Makefile b/arch/arm/xen/Makefile
index 1f85bfe..1296952 100644
--- a/arch/arm/xen/Makefile
+++ b/arch/arm/xen/Makefile
@@ -1 +1 @@
-obj-y := enlighten.o hypercall.o grant-table.o p2m.o mm.o mm32.o
+obj-y := enlighten.o hypercall.o grant-table.o p2m.o mm.o
diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index b0e77de..ff413a8 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -1,6 +1,10 @@
+#include <linux/cpu.h>
+#include <linux/dma-mapping.h>
#include <linux/bootmem.h>
#include <linux/gfp.h>
+#include <linux/highmem.h>
#include <linux/export.h>
+#include <linux/of_address.h>
#include <linux/slab.h>
#include <linux/types.h>
#include <linux/dma-mapping.h>
@@ -16,6 +20,125 @@
#include <asm/xen/hypercall.h>
#include <asm/xen/interface.h>

+
+#ifdef CONFIG_ARM64
+static inline void dmac_map_area(const void *start, size_t size, int dir)
+{
+ return __dma_map_area(start, size, dir);
+}
+
+static inline void dmac_unmap_area(const void *start, size_t size, int dir)
+{
+ return __dma_unmap_area(start, size, dir);
+}
+
+static inline bool cache_is_vipt_nonaliasing(void)
+{
+ return true;
+}
+
+static inline void *kmap_high_get(struct page *page)
+{
+ return NULL;
+}
+
+static inline void kunmap_high(struct page *page) {}
+#endif
+
+/* functions called by SWIOTLB */
+
+static void dma_cache_maint(dma_addr_t handle, unsigned long offset,
+ size_t size, enum dma_data_direction dir,
+ void (*op)(const void *, size_t, int))
+{
+ unsigned long pfn;
+ size_t left = size;
+
+ pfn = (handle >> PAGE_SHIFT) + offset / PAGE_SIZE;
+ offset %= PAGE_SIZE;
+
+ do {
+ size_t len = left;
+ void *vaddr;
+
+ if (!pfn_valid(pfn))
+ {
+ /* TODO: cache flush */
+ } else {
+ struct page *page = pfn_to_page(pfn);
+
+ if (PageHighMem(page)) {
+ if (len + offset > PAGE_SIZE)
+ len = PAGE_SIZE - offset;
+
+ if (cache_is_vipt_nonaliasing()) {
+ vaddr = kmap_atomic(page);
+ op(vaddr + offset, len, dir);
+ kunmap_atomic(vaddr);
+ } else {
+ vaddr = kmap_high_get(page);
+ if (vaddr) {
+ op(vaddr + offset, len, dir);
+ kunmap_high(page);
+ }
+ }
+ } else {
+ vaddr = page_address(page) + offset;
+ op(vaddr, len, dir);
+ }
+ }
+
+ offset = 0;
+ pfn++;
+ left -= len;
+ } while (left);
+}
+
+static void __xen_dma_page_dev_to_cpu(struct device *hwdev, dma_addr_t handle,
+ size_t size, enum dma_data_direction dir)
+{
+ /* Cannot use __dma_page_dev_to_cpu because we don't have a
+ * struct page for handle */
+
+ dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, dmac_unmap_area);
+}
+
+static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t handle,
+ size_t size, enum dma_data_direction dir)
+{
+
+ dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, dmac_map_area);
+}
+
+void xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
+ size_t size, enum dma_data_direction dir,
+ struct dma_attrs *attrs)
+
+{
+ if (is_device_dma_coherent(hwdev))
+ return;
+ if (dma_get_attr(DMA_ATTR_SKIP_CPU_SYNC, attrs))
+ return;
+
+ __xen_dma_page_dev_to_cpu(hwdev, handle, size, dir);
+}
+
+void xen_dma_sync_single_for_cpu(struct device *hwdev,
+ dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+ if (is_device_dma_coherent(hwdev))
+ return;
+ __xen_dma_page_dev_to_cpu(hwdev, handle, size, dir);
+}
+
+void xen_dma_sync_single_for_device(struct device *hwdev,
+ dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+ if (is_device_dma_coherent(hwdev))
+ return;
+ __xen_dma_page_cpu_to_dev(hwdev, handle, size, dir);
+}
+
int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
unsigned int address_bits,
dma_addr_t *dma_handle)
diff --git a/arch/arm/xen/mm32.c b/arch/arm/xen/mm32.c
deleted file mode 100644
index d8ed359..0000000
--- a/arch/arm/xen/mm32.c
+++ /dev/null
@@ -1,110 +0,0 @@
-#include <linux/cpu.h>
-#include <linux/dma-mapping.h>
-#include <linux/gfp.h>
-#include <linux/highmem.h>
-
-#include <xen/features.h>
-
-
-/* functions called by SWIOTLB */
-
-static void dma_cache_maint(dma_addr_t handle, unsigned long offset,
- size_t size, enum dma_data_direction dir,
- void (*op)(const void *, size_t, int))
-{
- unsigned long pfn;
- size_t left = size;
-
- pfn = (handle >> PAGE_SHIFT) + offset / PAGE_SIZE;
- offset %= PAGE_SIZE;
-
- do {
- size_t len = left;
- void *vaddr;
-
- if (!pfn_valid(pfn))
- {
- /* TODO: cache flush */
- } else {
- struct page *page = pfn_to_page(pfn);
-
- if (PageHighMem(page)) {
- if (len + offset > PAGE_SIZE)
- len = PAGE_SIZE - offset;
-
- if (cache_is_vipt_nonaliasing()) {
- vaddr = kmap_atomic(page);
- op(vaddr + offset, len, dir);
- kunmap_atomic(vaddr);
- } else {
- vaddr = kmap_high_get(page);
- if (vaddr) {
- op(vaddr + offset, len, dir);
- kunmap_high(page);
- }
- }
- } else {
- vaddr = page_address(page) + offset;
- op(vaddr, len, dir);
- }
- }
-
- offset = 0;
- pfn++;
- left -= len;
- } while (left);
-}
-
-static void __xen_dma_page_dev_to_cpu(struct device *hwdev, dma_addr_t handle,
- size_t size, enum dma_data_direction dir)
-{
- /* Cannot use __dma_page_dev_to_cpu because we don't have a
- * struct page for handle */
-
- dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, dmac_unmap_area);
-}
-
-static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t handle,
- size_t size, enum dma_data_direction dir)
-{
-
- dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, dmac_map_area);
-}
-
-void xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
- size_t size, enum dma_data_direction dir,
- struct dma_attrs *attrs)
-
-{
- if (is_device_dma_coherent(hwdev))
- return;
- if (dma_get_attr(DMA_ATTR_SKIP_CPU_SYNC, attrs))
- return;
-
- __xen_dma_page_dev_to_cpu(hwdev, handle, size, dir);
-}
-
-void xen_dma_sync_single_for_cpu(struct device *hwdev,
- dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
- if (is_device_dma_coherent(hwdev))
- return;
- __xen_dma_page_dev_to_cpu(hwdev, handle, size, dir);
-}
-
-void xen_dma_sync_single_for_device(struct device *hwdev,
- dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
- if (is_device_dma_coherent(hwdev))
- return;
- __xen_dma_page_cpu_to_dev(hwdev, handle, size, dir);
-}
-
-int __init xen_mm32_init(void)
-{
- if (!xen_initial_domain())
- return 0;
-
- return 0;
-}
-arch_initcall(xen_mm32_init);
diff --git a/arch/arm64/include/asm/xen/page-coherent.h b/arch/arm64/include/asm/xen/page-coherent.h
index dde3fc9..2052102 100644
--- a/arch/arm64/include/asm/xen/page-coherent.h
+++ b/arch/arm64/include/asm/xen/page-coherent.h
@@ -1,43 +1 @@
-#ifndef _ASM_ARM64_XEN_PAGE_COHERENT_H
-#define _ASM_ARM64_XEN_PAGE_COHERENT_H
-
-#include <asm/page.h>
-#include <linux/dma-attrs.h>
-#include <linux/dma-mapping.h>
-
-static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
- dma_addr_t *dma_handle, gfp_t flags,
- struct dma_attrs *attrs)
-{
- return __generic_dma_ops(hwdev)->alloc(hwdev, size, dma_handle, flags, attrs);
-}
-
-static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
- void *cpu_addr, dma_addr_t dma_handle,
- struct dma_attrs *attrs)
-{
- __generic_dma_ops(hwdev)->free(hwdev, size, cpu_addr, dma_handle, attrs);
-}
-
-static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
- unsigned long offset, size_t size, enum dma_data_direction dir,
- struct dma_attrs *attrs)
-{
-}
-
-static inline void xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
- size_t size, enum dma_data_direction dir,
- struct dma_attrs *attrs)
-{
-}
-
-static inline void xen_dma_sync_single_for_cpu(struct device *hwdev,
- dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
-}
-
-static inline void xen_dma_sync_single_for_device(struct device *hwdev,
- dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
-}
-#endif /* _ASM_ARM64_XEN_PAGE_COHERENT_H */
+#include <../../arm/include/asm/xen/page-coherent.h>
--
1.7.10.4

2014-10-27 15:12:13

by Stefano Stabellini

Subject: [PATCH v7 7/8] xen/arm/arm64: introduce xen_arch_need_swiotlb

Introduce an arch specific function to find out whether a particular dma
mapping operation needs to bounce on the swiotlb buffer.

On ARM and ARM64, if the page involved is a foreign page and the device
is not coherent, we need to bounce because at unmap time we cannot
execute any required cache maintenance operations (we don't know how to
find the pfn from the mfn).

No change of behaviour for x86.

Signed-off-by: Stefano Stabellini <[email protected]>
Acked-by: Ian Campbell <[email protected]>

---

Changes in v6:
- fix ts.

Changes in v5:
- fix indentation.
---
arch/arm/include/asm/xen/page.h | 4 ++++
arch/arm/xen/mm.c | 7 +++++++
arch/x86/include/asm/xen/page.h | 7 +++++++
drivers/xen/swiotlb-xen.c | 5 ++++-
4 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
index 135c24a..68c739b 100644
--- a/arch/arm/include/asm/xen/page.h
+++ b/arch/arm/include/asm/xen/page.h
@@ -107,4 +107,8 @@ static inline bool set_phys_to_machine(unsigned long pfn, unsigned long mfn)
#define xen_remap(cookie, size) ioremap_cache((cookie), (size))
#define xen_unmap(cookie) iounmap((cookie))

+bool xen_arch_need_swiotlb(struct device *dev,
+ unsigned long pfn,
+ unsigned long mfn);
+
#endif /* _ASM_ARM_XEN_PAGE_H */
diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index ff413a8..28396aa 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -139,6 +139,13 @@ void xen_dma_sync_single_for_device(struct device *hwdev,
__xen_dma_page_cpu_to_dev(hwdev, handle, size, dir);
}

+bool xen_arch_need_swiotlb(struct device *dev,
+ unsigned long pfn,
+ unsigned long mfn)
+{
+ return ((pfn != mfn) && !is_device_dma_coherent(dev));
+}
+
int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
unsigned int address_bits,
dma_addr_t *dma_handle)
diff --git a/arch/x86/include/asm/xen/page.h b/arch/x86/include/asm/xen/page.h
index c949923..f58ef6c 100644
--- a/arch/x86/include/asm/xen/page.h
+++ b/arch/x86/include/asm/xen/page.h
@@ -236,4 +236,11 @@ void make_lowmem_page_readwrite(void *vaddr);
#define xen_remap(cookie, size) ioremap((cookie), (size));
#define xen_unmap(cookie) iounmap((cookie))

+static inline bool xen_arch_need_swiotlb(struct device *dev,
+ unsigned long pfn,
+ unsigned long mfn)
+{
+ return false;
+}
+
#endif /* _ASM_X86_XEN_PAGE_H */
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index ebd8f21..ac5d41b 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -399,7 +399,9 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
* buffering it.
*/
if (dma_capable(dev, dev_addr, size) &&
- !range_straddles_page_boundary(phys, size) && !swiotlb_force) {
+ !range_straddles_page_boundary(phys, size) &&
+ !xen_arch_need_swiotlb(dev, PFN_DOWN(phys), PFN_DOWN(dev_addr)) &&
+ !swiotlb_force) {
/* we are not interested in the dma_addr returned by
* xen_dma_map_page, only in the potential cache flushes executed
* by the function. */
@@ -557,6 +559,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl,
dma_addr_t dev_addr = xen_phys_to_bus(paddr);

if (swiotlb_force ||
+ xen_arch_need_swiotlb(hwdev, PFN_DOWN(paddr), PFN_DOWN(dev_addr)) ||
!dma_capable(hwdev, dev_addr, sg->length) ||
range_straddles_page_boundary(paddr, sg->length)) {
phys_addr_t map = swiotlb_tbl_map_single(hwdev,
--
1.7.10.4

2014-10-27 15:13:09

by Stefano Stabellini

Subject: [PATCH v7 2/8] xen/arm: remove outer_*_range call

Dom0 is not actually capable of issuing outer_inv_range or
outer_clean_range calls.

Signed-off-by: Stefano Stabellini <[email protected]>
Acked-by: Ian Campbell <[email protected]>
---
arch/arm/xen/mm32.c | 9 ---------
1 file changed, 9 deletions(-)

diff --git a/arch/arm/xen/mm32.c b/arch/arm/xen/mm32.c
index a5a93fc..6153d61 100644
--- a/arch/arm/xen/mm32.c
+++ b/arch/arm/xen/mm32.c
@@ -61,9 +61,6 @@ static void __xen_dma_page_dev_to_cpu(struct device *hwdev, dma_addr_t handle,
/* Cannot use __dma_page_dev_to_cpu because we don't have a
* struct page for handle */

- if (dir != DMA_TO_DEVICE)
- outer_inv_range(handle, handle + size);
-
dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, dmac_unmap_area);
}

@@ -72,12 +69,6 @@ static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t handle,
{

dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, dmac_map_area);
-
- if (dir == DMA_FROM_DEVICE) {
- outer_inv_range(handle, handle + size);
- } else {
- outer_clean_range(handle, handle + size);
- }
}

void xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
--
1.7.10.4

2014-10-27 15:13:12

by Stefano Stabellini

Subject: [PATCH v7 3/8] arm64: introduce is_device_dma_coherent

Introduce a boolean flag and an accessor function to check whether a
device is dma_coherent. Set the flag from set_arch_dma_coherent_ops.

Signed-off-by: Stefano Stabellini <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>
CC: [email protected]
---
arch/arm64/include/asm/device.h | 1 +
arch/arm64/include/asm/dma-mapping.h | 6 ++++++
2 files changed, 7 insertions(+)

diff --git a/arch/arm64/include/asm/device.h b/arch/arm64/include/asm/device.h
index cf98b36..243ef25 100644
--- a/arch/arm64/include/asm/device.h
+++ b/arch/arm64/include/asm/device.h
@@ -21,6 +21,7 @@ struct dev_archdata {
#ifdef CONFIG_IOMMU_API
void *iommu; /* private IOMMU data */
#endif
+ bool dma_coherent;
};

struct pdev_archdata {
diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h
index adeae3f..e213dc4 100644
--- a/arch/arm64/include/asm/dma-mapping.h
+++ b/arch/arm64/include/asm/dma-mapping.h
@@ -54,11 +54,17 @@ static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)

static inline int set_arch_dma_coherent_ops(struct device *dev)
{
+ dev->archdata.dma_coherent = true;
set_dma_ops(dev, &coherent_swiotlb_dma_ops);
return 0;
}
#define set_arch_dma_coherent_ops set_arch_dma_coherent_ops

+static inline bool is_device_dma_coherent(struct device *dev)
+{
+ return dev->archdata.dma_coherent;
+}
+
#include <asm-generic/dma-mapping-common.h>

static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
--
1.7.10.4

2014-10-27 15:13:31

by Stefano Stabellini

Subject: [PATCH v7 1/8] xen/arm: remove handling of XENFEAT_grant_map_identity

The feature has been removed from Xen. Also Linux cannot use it on ARM32
without CONFIG_ARM_LPAE.

Signed-off-by: Stefano Stabellini <[email protected]>
Reviewed-by: David Vrabel <[email protected]>
Acked-by: Ian Campbell <[email protected]>

---
Changes in v2:

- remove the definition of XENFEAT_grant_map_identity.
---
arch/arm/xen/enlighten.c | 5 ---
arch/arm/xen/mm32.c | 85 +-------------------------------------
include/xen/interface/features.h | 3 --
3 files changed, 1 insertion(+), 92 deletions(-)

diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 0e15f01..c7ca936 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -261,11 +261,6 @@ static int __init xen_guest_init(void)

xen_setup_features();

- if (!xen_feature(XENFEAT_grant_map_identity)) {
- pr_warn("Please upgrade your Xen.\n"
- "If your platform has any non-coherent DMA devices, they won't work properly.\n");
- }
-
if (xen_feature(XENFEAT_dom0))
xen_start_info->flags |= SIF_INITDOMAIN|SIF_PRIVILEGED;
else
diff --git a/arch/arm/xen/mm32.c b/arch/arm/xen/mm32.c
index 3b99860..a5a93fc 100644
--- a/arch/arm/xen/mm32.c
+++ b/arch/arm/xen/mm32.c
@@ -5,70 +5,6 @@

#include <xen/features.h>

-static DEFINE_PER_CPU(unsigned long, xen_mm32_scratch_virt);
-static DEFINE_PER_CPU(pte_t *, xen_mm32_scratch_ptep);
-
-static int alloc_xen_mm32_scratch_page(int cpu)
-{
- struct page *page;
- unsigned long virt;
- pmd_t *pmdp;
- pte_t *ptep;
-
- if (per_cpu(xen_mm32_scratch_ptep, cpu) != NULL)
- return 0;
-
- page = alloc_page(GFP_KERNEL);
- if (page == NULL) {
- pr_warn("Failed to allocate xen_mm32_scratch_page for cpu %d\n", cpu);
- return -ENOMEM;
- }
-
- virt = (unsigned long)__va(page_to_phys(page));
- pmdp = pmd_offset(pud_offset(pgd_offset_k(virt), virt), virt);
- ptep = pte_offset_kernel(pmdp, virt);
-
- per_cpu(xen_mm32_scratch_virt, cpu) = virt;
- per_cpu(xen_mm32_scratch_ptep, cpu) = ptep;
-
- return 0;
-}
-
-static int xen_mm32_cpu_notify(struct notifier_block *self,
- unsigned long action, void *hcpu)
-{
- int cpu = (long)hcpu;
- switch (action) {
- case CPU_UP_PREPARE:
- if (alloc_xen_mm32_scratch_page(cpu))
- return NOTIFY_BAD;
- break;
- default:
- break;
- }
- return NOTIFY_OK;
-}
-
-static struct notifier_block xen_mm32_cpu_notifier = {
- .notifier_call = xen_mm32_cpu_notify,
-};
-
-static void* xen_mm32_remap_page(dma_addr_t handle)
-{
- unsigned long virt = get_cpu_var(xen_mm32_scratch_virt);
- pte_t *ptep = __get_cpu_var(xen_mm32_scratch_ptep);
-
- *ptep = pfn_pte(handle >> PAGE_SHIFT, PAGE_KERNEL);
- local_flush_tlb_kernel_page(virt);
-
- return (void*)virt;
-}
-
-static void xen_mm32_unmap(void *vaddr)
-{
- put_cpu_var(xen_mm32_scratch_virt);
-}
-

/* functions called by SWIOTLB */

@@ -88,13 +24,7 @@ static void dma_cache_maint(dma_addr_t handle, unsigned long offset,

if (!pfn_valid(pfn))
{
- /* Cannot map the page, we don't know its physical address.
- * Return and hope for the best */
- if (!xen_feature(XENFEAT_grant_map_identity))
- return;
- vaddr = xen_mm32_remap_page(handle) + offset;
- op(vaddr, len, dir);
- xen_mm32_unmap(vaddr - offset);
+ /* TODO: cache flush */
} else {
struct page *page = pfn_to_page(pfn);

@@ -181,22 +111,9 @@ void xen_dma_sync_single_for_device(struct device *hwdev,

int __init xen_mm32_init(void)
{
- int cpu;
-
if (!xen_initial_domain())
return 0;

- register_cpu_notifier(&xen_mm32_cpu_notifier);
- get_online_cpus();
- for_each_online_cpu(cpu) {
- if (alloc_xen_mm32_scratch_page(cpu)) {
- put_online_cpus();
- unregister_cpu_notifier(&xen_mm32_cpu_notifier);
- return -ENOMEM;
- }
- }
- put_online_cpus();
-
return 0;
}
arch_initcall(xen_mm32_init);
diff --git a/include/xen/interface/features.h b/include/xen/interface/features.h
index 14334d0..131a6cc 100644
--- a/include/xen/interface/features.h
+++ b/include/xen/interface/features.h
@@ -53,9 +53,6 @@
/* operation as Dom0 is supported */
#define XENFEAT_dom0 11

-/* Xen also maps grant references at pfn = mfn */
-#define XENFEAT_grant_map_identity 12
-
#define XENFEAT_NR_SUBMAPS 1

#endif /* __XEN_PUBLIC_FEATURES_H__ */
--
1.7.10.4

2014-10-27 15:14:00

by Stefano Stabellini

Subject: [PATCH v7 8/8] xen/arm: introduce GNTTABOP_cache_flush

Introduce support for the new hypercall GNTTABOP_cache_flush.
Use it to perform cache flushing on pages used for dma when necessary.

If GNTTABOP_cache_flush is supported by the hypervisor, we don't need to
bounce dma map operations that involve foreign grants and non-coherent
devices.

Signed-off-by: Stefano Stabellini <[email protected]>
Acked-by: Ian Campbell <[email protected]>

---

Changes in v5:
- rename hypercall_flush to hypercall_cflush;
- remove spurious change.

Changes in v4:
- add comment;
- avoid bouncing dma map operations that involve foreign grants and
non-coherent devices if GNTTABOP_cache_flush is provided by Xen.

Changes in v3:
- fix the cache maintenance op call to match what Linux does natively;
- update the hypercall interface to match Xen.

Changes in v2:
- update the hypercall interface to match Xen;
- call the interface on a single page at a time.
---
arch/arm/xen/mm.c | 39 ++++++++++++++++++++++++++++++-----
include/xen/interface/grant_table.h | 19 +++++++++++++++++
2 files changed, 53 insertions(+), 5 deletions(-)

diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index 28396aa..e20a57f 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -12,6 +12,7 @@
#include <linux/swiotlb.h>

#include <xen/xen.h>
+#include <xen/interface/grant_table.h>
#include <xen/interface/memory.h>
#include <xen/swiotlb-xen.h>

@@ -20,6 +21,8 @@
#include <asm/xen/hypercall.h>
#include <asm/xen/interface.h>

+static bool hypercall_cflush = false;
+

#ifdef CONFIG_ARM64
static inline void dmac_map_area(const void *start, size_t size, int dir)
@@ -60,17 +63,35 @@ static void dma_cache_maint(dma_addr_t handle, unsigned long offset,
do {
size_t len = left;
void *vaddr;
+
+ /* buffers in highmem or foreign pages cannot cross page
+ * boundaries */
+ if (len + offset > PAGE_SIZE)
+ len = PAGE_SIZE - offset;

if (!pfn_valid(pfn))
{
- /* TODO: cache flush */
+ struct gnttab_cache_flush cflush;
+
+ cflush.op = 0;
+ cflush.a.dev_bus_addr = pfn << PAGE_SHIFT;
+ cflush.offset = offset;
+ cflush.length = len;
+
+ if (op == dmac_unmap_area && dir != DMA_TO_DEVICE)
+ cflush.op = GNTTAB_CACHE_INVAL;
+ if (op == dmac_map_area) {
+ if (dir == DMA_FROM_DEVICE)
+ cflush.op = GNTTAB_CACHE_INVAL;
+ else
+ cflush.op = GNTTAB_CACHE_CLEAN;
+ }
+ if (cflush.op)
+ HYPERVISOR_grant_table_op(GNTTABOP_cache_flush, &cflush, 1);
} else {
struct page *page = pfn_to_page(pfn);

if (PageHighMem(page)) {
- if (len + offset > PAGE_SIZE)
- len = PAGE_SIZE - offset;
-
if (cache_is_vipt_nonaliasing()) {
vaddr = kmap_atomic(page);
op(vaddr + offset, len, dir);
@@ -143,7 +164,7 @@ bool xen_arch_need_swiotlb(struct device *dev,
unsigned long pfn,
unsigned long mfn)
{
- return ((pfn != mfn) && !is_device_dma_coherent(dev));
+ return (!hypercall_cflush && (pfn != mfn) && !is_device_dma_coherent(dev));
}

int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
@@ -186,10 +207,18 @@ static struct dma_map_ops xen_swiotlb_dma_ops = {

int __init xen_mm_init(void)
{
+ struct gnttab_cache_flush cflush;
if (!xen_initial_domain())
return 0;
xen_swiotlb_init(1, false);
xen_dma_ops = &xen_swiotlb_dma_ops;
+
+ cflush.op = 0;
+ cflush.a.dev_bus_addr = 0;
+ cflush.offset = 0;
+ cflush.length = 0;
+ if (HYPERVISOR_grant_table_op(GNTTABOP_cache_flush, &cflush, 1) != -ENOSYS)
+ hypercall_cflush = true;
return 0;
}
arch_initcall(xen_mm_init);
diff --git a/include/xen/interface/grant_table.h b/include/xen/interface/grant_table.h
index e40fae9..bcce564 100644
--- a/include/xen/interface/grant_table.h
+++ b/include/xen/interface/grant_table.h
@@ -479,6 +479,25 @@ struct gnttab_get_version {
DEFINE_GUEST_HANDLE_STRUCT(gnttab_get_version);

/*
+ * Issue one or more cache maintenance operations on a portion of a
+ * page granted to the calling domain by a foreign domain.
+ */
+#define GNTTABOP_cache_flush 12
+struct gnttab_cache_flush {
+ union {
+ uint64_t dev_bus_addr;
+ grant_ref_t ref;
+ } a;
+ uint16_t offset; /* offset from start of grant */
+ uint16_t length; /* size within the grant */
+#define GNTTAB_CACHE_CLEAN (1<<0)
+#define GNTTAB_CACHE_INVAL (1<<1)
+#define GNTTAB_CACHE_SOURCE_GREF (1<<31)
+ uint32_t op;
+};
+DEFINE_GUEST_HANDLE_STRUCT(gnttab_cache_flush);
+
+/*
* Bitfield values for update_pin_status.flags.
*/
/* Map the grant entry for access by I/O devices. */
--
1.7.10.4

2014-10-27 15:14:37

by Stefano Stabellini

Subject: [PATCH v7 4/8] arm: introduce is_device_dma_coherent

Introduce a boolean flag and an accessor function to check whether a
device is dma_coherent. Set the flag from set_arch_dma_coherent_ops.

Signed-off-by: Stefano Stabellini <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>
CC: [email protected]
CC: [email protected]
---
arch/arm/include/asm/device.h | 1 +
arch/arm/include/asm/dma-mapping.h | 6 ++++++
2 files changed, 7 insertions(+)

diff --git a/arch/arm/include/asm/device.h b/arch/arm/include/asm/device.h
index dc662fc..4111592 100644
--- a/arch/arm/include/asm/device.h
+++ b/arch/arm/include/asm/device.h
@@ -17,6 +17,7 @@ struct dev_archdata {
#ifdef CONFIG_ARM_DMA_USE_IOMMU
struct dma_iommu_mapping *mapping;
#endif
+ bool dma_coherent;
};

struct omap_device;
diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index 85738b2..8c3b616 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -123,11 +123,17 @@ static inline unsigned long dma_max_pfn(struct device *dev)

static inline int set_arch_dma_coherent_ops(struct device *dev)
{
+ dev->archdata.dma_coherent = true;
set_dma_ops(dev, &arm_coherent_dma_ops);
return 0;
}
#define set_arch_dma_coherent_ops(dev) set_arch_dma_coherent_ops(dev)

+static inline bool is_device_dma_coherent(struct device *dev)
+{
+ return dev->archdata.dma_coherent;
+}
+
static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
{
unsigned int offset = paddr & ~PAGE_MASK;
--
1.7.10.4

2014-10-27 15:14:36

by Stefano Stabellini

Subject: [PATCH v7 5/8] xen/arm: use is_device_dma_coherent

Use is_device_dma_coherent to check whether we need to issue cache
maintenance operations, rather than checking for the existence of a
particular dma_ops function for the device.

This is correct because coherent devices don't need cache maintenance
operations - arm_coherent_dma_ops does not set the hooks whose
existence we were previously checking.

Signed-off-by: Stefano Stabellini <[email protected]>
Acked-by: Ian Campbell <[email protected]>
---
arch/arm/xen/mm32.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm/xen/mm32.c b/arch/arm/xen/mm32.c
index 6153d61..d8ed359 100644
--- a/arch/arm/xen/mm32.c
+++ b/arch/arm/xen/mm32.c
@@ -76,7 +76,7 @@ void xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
struct dma_attrs *attrs)

{
- if (!__generic_dma_ops(hwdev)->unmap_page)
+ if (is_device_dma_coherent(hwdev))
return;
if (dma_get_attr(DMA_ATTR_SKIP_CPU_SYNC, attrs))
return;
@@ -87,7 +87,7 @@ void xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
void xen_dma_sync_single_for_cpu(struct device *hwdev,
dma_addr_t handle, size_t size, enum dma_data_direction dir)
{
- if (!__generic_dma_ops(hwdev)->sync_single_for_cpu)
+ if (is_device_dma_coherent(hwdev))
return;
__xen_dma_page_dev_to_cpu(hwdev, handle, size, dir);
}
@@ -95,7 +95,7 @@ void xen_dma_sync_single_for_cpu(struct device *hwdev,
void xen_dma_sync_single_for_device(struct device *hwdev,
dma_addr_t handle, size_t size, enum dma_data_direction dir)
{
- if (!__generic_dma_ops(hwdev)->sync_single_for_device)
+ if (is_device_dma_coherent(hwdev))
return;
__xen_dma_page_cpu_to_dev(hwdev, handle, size, dir);
}
--
1.7.10.4

2014-11-03 10:54:38

by Stefano Stabellini

Subject: Re: [PATCH v7 7/8] xen/arm/arm64: introduce xen_arch_need_swiotlb

On Mon, 27 Oct 2014, Stefano Stabellini wrote:
> Introduce an arch specific function to find out whether a particular dma
> mapping operation needs to bounce on the swiotlb buffer.
>
> On ARM and ARM64, if the page involved is a foreign page and the device
> is not coherent, we need to bounce because at unmap time we cannot
> execute any required cache maintenance operations (we don't know how to
> find the pfn from the mfn).
>
> No change of behaviour for x86.
>
> Signed-off-by: Stefano Stabellini <[email protected]>
> Acked-by: Ian Campbell <[email protected]>

Konrad, David, are you OK with the swiotlb-xen changes?



> Changes in v6:
> - fix ts.
>
> Changes in v5:
> - fix indentation.
> ---
> arch/arm/include/asm/xen/page.h | 4 ++++
> arch/arm/xen/mm.c | 7 +++++++
> arch/x86/include/asm/xen/page.h | 7 +++++++
> drivers/xen/swiotlb-xen.c | 5 ++++-
> 4 files changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
> index 135c24a..68c739b 100644
> --- a/arch/arm/include/asm/xen/page.h
> +++ b/arch/arm/include/asm/xen/page.h
> @@ -107,4 +107,8 @@ static inline bool set_phys_to_machine(unsigned long pfn, unsigned long mfn)
> #define xen_remap(cookie, size) ioremap_cache((cookie), (size))
> #define xen_unmap(cookie) iounmap((cookie))
>
> +bool xen_arch_need_swiotlb(struct device *dev,
> + unsigned long pfn,
> + unsigned long mfn);
> +
> #endif /* _ASM_ARM_XEN_PAGE_H */
> diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
> index ff413a8..28396aa 100644
> --- a/arch/arm/xen/mm.c
> +++ b/arch/arm/xen/mm.c
> @@ -139,6 +139,13 @@ void xen_dma_sync_single_for_device(struct device *hwdev,
> __xen_dma_page_cpu_to_dev(hwdev, handle, size, dir);
> }
>
> +bool xen_arch_need_swiotlb(struct device *dev,
> + unsigned long pfn,
> + unsigned long mfn)
> +{
> + return ((pfn != mfn) && !is_device_dma_coherent(dev));
> +}
> +
> int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
> unsigned int address_bits,
> dma_addr_t *dma_handle)
> diff --git a/arch/x86/include/asm/xen/page.h b/arch/x86/include/asm/xen/page.h
> index c949923..f58ef6c 100644
> --- a/arch/x86/include/asm/xen/page.h
> +++ b/arch/x86/include/asm/xen/page.h
> @@ -236,4 +236,11 @@ void make_lowmem_page_readwrite(void *vaddr);
> #define xen_remap(cookie, size) ioremap((cookie), (size));
> #define xen_unmap(cookie) iounmap((cookie))
>
> +static inline bool xen_arch_need_swiotlb(struct device *dev,
> + unsigned long pfn,
> + unsigned long mfn)
> +{
> + return false;
> +}
> +
> #endif /* _ASM_X86_XEN_PAGE_H */
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index ebd8f21..ac5d41b 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -399,7 +399,9 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
> * buffering it.
> */
> if (dma_capable(dev, dev_addr, size) &&
> - !range_straddles_page_boundary(phys, size) && !swiotlb_force) {
> + !range_straddles_page_boundary(phys, size) &&
> + !xen_arch_need_swiotlb(dev, PFN_DOWN(phys), PFN_DOWN(dev_addr)) &&
> + !swiotlb_force) {
> /* we are not interested in the dma_addr returned by
> * xen_dma_map_page, only in the potential cache flushes executed
> * by the function. */
> @@ -557,6 +559,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl,
> dma_addr_t dev_addr = xen_phys_to_bus(paddr);
>
> if (swiotlb_force ||
> + xen_arch_need_swiotlb(hwdev, PFN_DOWN(paddr), PFN_DOWN(dev_addr)) ||
> !dma_capable(hwdev, dev_addr, sg->length) ||
> range_straddles_page_boundary(paddr, sg->length)) {
> phys_addr_t map = swiotlb_tbl_map_single(hwdev,
> --
> 1.7.10.4
>

2014-11-03 10:55:43

by Stefano Stabellini

Subject: Re: [PATCH v7 4/8] arm: introduce is_device_dma_coherent

On Mon, 27 Oct 2014, Stefano Stabellini wrote:
> Introduce a boolean flag and an accessor function to check whether a
> device is dma_coherent. Set the flag from set_arch_dma_coherent_ops.
>
> Signed-off-by: Stefano Stabellini <[email protected]>
> Signed-off-by: Catalin Marinas <[email protected]>
> CC: [email protected]
> CC: [email protected]

Any opinions on this?
FYI I have just submitted the patch to Russell's patch tracking system.


> arch/arm/include/asm/device.h | 1 +
> arch/arm/include/asm/dma-mapping.h | 6 ++++++
> 2 files changed, 7 insertions(+)
>
> diff --git a/arch/arm/include/asm/device.h b/arch/arm/include/asm/device.h
> index dc662fc..4111592 100644
> --- a/arch/arm/include/asm/device.h
> +++ b/arch/arm/include/asm/device.h
> @@ -17,6 +17,7 @@ struct dev_archdata {
> #ifdef CONFIG_ARM_DMA_USE_IOMMU
> struct dma_iommu_mapping *mapping;
> #endif
> + bool dma_coherent;
> };
>
> struct omap_device;
> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> index 85738b2..8c3b616 100644
> --- a/arch/arm/include/asm/dma-mapping.h
> +++ b/arch/arm/include/asm/dma-mapping.h
> @@ -123,11 +123,17 @@ static inline unsigned long dma_max_pfn(struct device *dev)
>
> static inline int set_arch_dma_coherent_ops(struct device *dev)
> {
> + dev->archdata.dma_coherent = true;
> set_dma_ops(dev, &arm_coherent_dma_ops);
> return 0;
> }
> #define set_arch_dma_coherent_ops(dev) set_arch_dma_coherent_ops(dev)
>
> +static inline bool is_device_dma_coherent(struct device *dev)
> +{
> + return dev->archdata.dma_coherent;
> +}
> +
> static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
> {
> unsigned int offset = paddr & ~PAGE_MASK;
> --
> 1.7.10.4
>

2014-11-03 10:57:23

by Will Deacon

Subject: Re: [PATCH v7 3/8] arm64: introduce is_device_dma_coherent

On Mon, Nov 03, 2014 at 10:46:03AM +0000, Stefano Stabellini wrote:
> On Mon, 27 Oct 2014, Stefano Stabellini wrote:
> > Introduce a boolean flag and an accessor function to check whether a
> > device is dma_coherent. Set the flag from set_arch_dma_coherent_ops.
> >
> > Signed-off-by: Stefano Stabellini <[email protected]>
> > Signed-off-by: Catalin Marinas <[email protected]>
> > CC: [email protected]
>
> Will, Catalin,
> are you OK with this patch?

It would be nicer if the dma_coherent flag didn't have to be duplicated by
each architecture in dev_archdata. Is there any reason not to put it in the
core code?

Will

2014-11-03 10:54:35

by Stefano Stabellini

Subject: Re: [PATCH v7 3/8] arm64: introduce is_device_dma_coherent

On Mon, 27 Oct 2014, Stefano Stabellini wrote:
> Introduce a boolean flag and an accessor function to check whether a
> device is dma_coherent. Set the flag from set_arch_dma_coherent_ops.
>
> Signed-off-by: Stefano Stabellini <[email protected]>
> Signed-off-by: Catalin Marinas <[email protected]>
> CC: [email protected]

Will, Catalin,
are you OK with this patch?


> arch/arm64/include/asm/device.h | 1 +
> arch/arm64/include/asm/dma-mapping.h | 6 ++++++
> 2 files changed, 7 insertions(+)
>
> diff --git a/arch/arm64/include/asm/device.h b/arch/arm64/include/asm/device.h
> index cf98b36..243ef25 100644
> --- a/arch/arm64/include/asm/device.h
> +++ b/arch/arm64/include/asm/device.h
> @@ -21,6 +21,7 @@ struct dev_archdata {
> #ifdef CONFIG_IOMMU_API
> void *iommu; /* private IOMMU data */
> #endif
> + bool dma_coherent;
> };
>
> struct pdev_archdata {
> diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h
> index adeae3f..e213dc4 100644
> --- a/arch/arm64/include/asm/dma-mapping.h
> +++ b/arch/arm64/include/asm/dma-mapping.h
> @@ -54,11 +54,17 @@ static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
>
> static inline int set_arch_dma_coherent_ops(struct device *dev)
> {
> + dev->archdata.dma_coherent = true;
> set_dma_ops(dev, &coherent_swiotlb_dma_ops);
> return 0;
> }
> #define set_arch_dma_coherent_ops set_arch_dma_coherent_ops
>
> +static inline bool is_device_dma_coherent(struct device *dev)
> +{
> + return dev->archdata.dma_coherent;
> +}
> +
> #include <asm-generic/dma-mapping-common.h>
>
> static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
> --
> 1.7.10.4
>

2014-11-03 11:01:55

by David Vrabel

Subject: Re: [Xen-devel] [PATCH v7 7/8] xen/arm/arm64: introduce xen_arch_need_swiotlb

On 03/11/14 10:45, Stefano Stabellini wrote:
> On Mon, 27 Oct 2014, Stefano Stabellini wrote:
>> Introduce an arch specific function to find out whether a particular dma
>> mapping operation needs to bounce on the swiotlb buffer.
>>
>> On ARM and ARM64, if the page involved is a foreign page and the device
>> is not coherent, we need to bounce because at unmap time we cannot
>> execute any required cache maintenance operations (we don't know how to
>> find the pfn from the mfn).
>>
>> No change of behaviour for x86.
>>
>> Signed-off-by: Stefano Stabellini <[email protected]>
>> Acked-by: Ian Campbell <[email protected]>
>
> Konrad, David, are you OK with the swiotlb-xen changes?

Reviewed-by: David Vrabel <[email protected]>

But swiotlb is Konrad's responsibility.

David

2014-11-03 11:28:25

by Stefano Stabellini

Subject: Re: [PATCH v7 3/8] arm64: introduce is_device_dma_coherent

On Mon, 3 Nov 2014, Will Deacon wrote:
> On Mon, Nov 03, 2014 at 10:46:03AM +0000, Stefano Stabellini wrote:
> > On Mon, 27 Oct 2014, Stefano Stabellini wrote:
> > > Introduce a boolean flag and an accessor function to check whether a
> > > device is dma_coherent. Set the flag from set_arch_dma_coherent_ops.
> > >
> > > Signed-off-by: Stefano Stabellini <[email protected]>
> > > Signed-off-by: Catalin Marinas <[email protected]>
> > > CC: [email protected]
> >
> > Will, Catalin,
> > are you OK with this patch?
>
> It would be nicer if the dma_coherent flag didn't have to be duplicated by
> each architecture in dev_archdata. Is there any reason not to put it in the
> core code?

Yes, there is a reason for it: if I added a boolean dma_coherent flag in
struct device as Catalin initially suggested, what would be the default
for each architecture? Where would I set it for arch that don't use
device tree? It is not easy.

I thought it would be better to introduce is_device_dma_coherent only on
the architectures where it certainly makes sense to have it. In fact I
checked and arm and arm64 are the only architectures to define
set_arch_dma_coherent_ops at the moment. At that point if
is_device_dma_coherent becomes arch-specific, it makes sense to store
the flag in dev_archdata instead of struct device.

2014-11-03 16:05:13

by Konrad Rzeszutek Wilk

Subject: Re: [PATCH v7 7/8] xen/arm/arm64: introduce xen_arch_need_swiotlb

On Mon, Nov 03, 2014 at 10:45:14AM +0000, Stefano Stabellini wrote:
> On Mon, 27 Oct 2014, Stefano Stabellini wrote:
> > Introduce an arch specific function to find out whether a particular dma
> > mapping operation needs to bounce on the swiotlb buffer.
> >
> > On ARM and ARM64, if the page involved is a foreign page and the device
> > is not coherent, we need to bounce because at unmap time we cannot
> > execute any required cache maintenance operations (we don't know how to
> > find the pfn from the mfn).
> >
> > No change of behaviour for x86.
> >
> > Signed-off-by: Stefano Stabellini <[email protected]>
> > Acked-by: Ian Campbell <[email protected]>
>
> Konrad, David, are you OK with the swiotlb-xen changes?

I am OK.
>
>
>
> > Changes in v6:
> > - fix ts.
> >
> > Changes in v5:
> > - fix indentation.
> > ---
> > arch/arm/include/asm/xen/page.h | 4 ++++
> > arch/arm/xen/mm.c | 7 +++++++
> > arch/x86/include/asm/xen/page.h | 7 +++++++
> > drivers/xen/swiotlb-xen.c | 5 ++++-
> > 4 files changed, 22 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
> > index 135c24a..68c739b 100644
> > --- a/arch/arm/include/asm/xen/page.h
> > +++ b/arch/arm/include/asm/xen/page.h
> > @@ -107,4 +107,8 @@ static inline bool set_phys_to_machine(unsigned long pfn, unsigned long mfn)
> > #define xen_remap(cookie, size) ioremap_cache((cookie), (size))
> > #define xen_unmap(cookie) iounmap((cookie))
> >
> > +bool xen_arch_need_swiotlb(struct device *dev,
> > + unsigned long pfn,
> > + unsigned long mfn);
> > +
> > #endif /* _ASM_ARM_XEN_PAGE_H */
> > diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
> > index ff413a8..28396aa 100644
> > --- a/arch/arm/xen/mm.c
> > +++ b/arch/arm/xen/mm.c
> > @@ -139,6 +139,13 @@ void xen_dma_sync_single_for_device(struct device *hwdev,
> > __xen_dma_page_cpu_to_dev(hwdev, handle, size, dir);
> > }
> >
> > +bool xen_arch_need_swiotlb(struct device *dev,
> > + unsigned long pfn,
> > + unsigned long mfn)
> > +{
> > + return ((pfn != mfn) && !is_device_dma_coherent(dev));
> > +}
> > +
> > int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
> > unsigned int address_bits,
> > dma_addr_t *dma_handle)
> > diff --git a/arch/x86/include/asm/xen/page.h b/arch/x86/include/asm/xen/page.h
> > index c949923..f58ef6c 100644
> > --- a/arch/x86/include/asm/xen/page.h
> > +++ b/arch/x86/include/asm/xen/page.h
> > @@ -236,4 +236,11 @@ void make_lowmem_page_readwrite(void *vaddr);
> > #define xen_remap(cookie, size) ioremap((cookie), (size));
> > #define xen_unmap(cookie) iounmap((cookie))
> >
> > +static inline bool xen_arch_need_swiotlb(struct device *dev,
> > + unsigned long pfn,
> > + unsigned long mfn)
> > +{
> > + return false;
> > +}
> > +
> > #endif /* _ASM_X86_XEN_PAGE_H */
> > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> > index ebd8f21..ac5d41b 100644
> > --- a/drivers/xen/swiotlb-xen.c
> > +++ b/drivers/xen/swiotlb-xen.c
> > @@ -399,7 +399,9 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
> > * buffering it.
> > */
> > if (dma_capable(dev, dev_addr, size) &&
> > - !range_straddles_page_boundary(phys, size) && !swiotlb_force) {
> > + !range_straddles_page_boundary(phys, size) &&
> > + !xen_arch_need_swiotlb(dev, PFN_DOWN(phys), PFN_DOWN(dev_addr)) &&
> > + !swiotlb_force) {
> > /* we are not interested in the dma_addr returned by
> > * xen_dma_map_page, only in the potential cache flushes executed
> > * by the function. */
> > @@ -557,6 +559,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl,
> > dma_addr_t dev_addr = xen_phys_to_bus(paddr);
> >
> > if (swiotlb_force ||
> > + xen_arch_need_swiotlb(hwdev, PFN_DOWN(paddr), PFN_DOWN(dev_addr)) ||
> > !dma_capable(hwdev, dev_addr, sg->length) ||
> > range_straddles_page_boundary(paddr, sg->length)) {
> > phys_addr_t map = swiotlb_tbl_map_single(hwdev,
> > --
> > 1.7.10.4
> >

2014-11-04 11:36:14

by Grygorii Strashko

Subject: Re: [PATCH v7 3/8] arm64: introduce is_device_dma_coherent

Hi Stefano,

On 11/03/2014 01:10 PM, Stefano Stabellini wrote:
> On Mon, 3 Nov 2014, Will Deacon wrote:
>> On Mon, Nov 03, 2014 at 10:46:03AM +0000, Stefano Stabellini wrote:
>>> On Mon, 27 Oct 2014, Stefano Stabellini wrote:
>>>> Introduce a boolean flag and an accessor function to check whether a
>>>> device is dma_coherent. Set the flag from set_arch_dma_coherent_ops.
>>>>
>>>> Signed-off-by: Stefano Stabellini <[email protected]>
>>>> Signed-off-by: Catalin Marinas <[email protected]>
>>>> CC: [email protected]
>>>
>>> Will, Catalin,
>>> are you OK with this patch?
>>
>> It would be nicer if the dma_coherent flag didn't have to be duplicated by
>> each architecture in dev_archdata. Is there any reason not to put it in the
>> core code?
>
> Yes, there is a reason for it: if I added a boolean dma_coherent flag in
> struct device as Catalin initially suggested, what would be the default
> for each architecture? Where would I set it for arch that don't use
> device tree? It is not easy.
>
> I thought it would be better to introduce is_device_dma_coherent only on
> the architectures where it certainly makes sense to have it. In fact I
> checked and arm and arm64 are the only architectures to define
> set_arch_dma_coherent_ops at the moment. At that point if
> is_device_dma_coherent becomes arch-specific, it makes sense to store
> the flag in dev_archdata instead of struct device.

The proposition from Will looks reasonable to me too, because
there is a "small" side-effect of adding this kind of property to
arch-specific data, or even to the core device structure. ;(

There are some sub-systems in the kernel which do not create their devices
from DT; instead, a host device populates its child devices manually.
I know of at least two cases:
- usb: dwc3 core creates xhci device manually
- pci: adds its client devices

In such cases the DMA configuration has to be propagated from host to
child (in our case the host device got its DMA configuration from DT), like:
dma_set_coherent_mask(&xhci->dev, dwc->dev->coherent_dma_mask);

xhci->dev.parent = dwc->dev;
xhci->dev.dma_mask = dwc->dev->dma_mask;
xhci->dev.dma_parms = dwc->dev->dma_parms;

So, once a new DMA property is added, it has to be propagated from
host to child device too.

Recently the new property dma_pfn_offset was introduced in struct device
and this kind of problem was observed on Keystone 2:
- for the usb case it was fixed using a platform bus notifier (xhci is a platform device);
- for pci the work is in progress, because the solution with a PCI bus notifier
was rejected: https://lkml.org/lkml/2014/10/10/308.

In general, if dma_coherent belongs to struct device then
such problems can be fixed directly in drivers/subsystems:
xhci->dev.dma_coherent = dwc->dev->dma_coherent;

But if it is arch-specific data, it will be impossible to
set without introducing proper arch-specific setter/getter functions.

Also, as an idea, we are thinking about introducing something like:
void dma_apply_parent_cfg(struct device *dev, struct device *parent)
which would ensure that all DMA configuration is properly copied from
parent to child device. For now that would cover (at minimum, for ARM;
a rough sketch follows the list):
dma_mask
coherent_dma_mask
dma_parms
dma_pfn_offset
dev_archdata->dma_ops
[dma_coherent]?
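
A rough sketch of the idea, purely illustrative: neither the helper nor
its exact field list exists as a kernel API today, and archdata members
vary per architecture (dma_ops is shown as on ARM):

void dma_apply_parent_cfg(struct device *dev, struct device *parent)
{
	dev->dma_mask = parent->dma_mask;
	dev->coherent_dma_mask = parent->coherent_dma_mask;
	dev->dma_parms = parent->dma_parms;
	dev->dma_pfn_offset = parent->dma_pfn_offset;
	dev->archdata.dma_ops = parent->archdata.dma_ops;
	/* [dma_coherent]? - only if the flag ends up in archdata: */
	dev->archdata.dma_coherent = parent->archdata.dma_coherent;
}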

regards,
-grygorii

2014-11-04 15:00:51

by Stefano Stabellini

Subject: Re: [PATCH v7 3/8] arm64: introduce is_device_dma_coherent

On Tue, 4 Nov 2014, Grygorii Strashko wrote:
> Hi Stefano,
>
> On 11/03/2014 01:10 PM, Stefano Stabellini wrote:
> > On Mon, 3 Nov 2014, Will Deacon wrote:
> >> On Mon, Nov 03, 2014 at 10:46:03AM +0000, Stefano Stabellini wrote:
> >>> On Mon, 27 Oct 2014, Stefano Stabellini wrote:
> >>>> Introduce a boolean flag and an accessor function to check whether a
> >>>> device is dma_coherent. Set the flag from set_arch_dma_coherent_ops.
> >>>>
> >>>> Signed-off-by: Stefano Stabellini <[email protected]>
> >>>> Signed-off-by: Catalin Marinas <[email protected]>
> >>>> CC: [email protected]
> >>>
> >>> Will, Catalin,
> >>> are you OK with this patch?
> >>
> >> It would be nicer if the dma_coherent flag didn't have to be duplicated by
> >> each architecture in dev_archdata. Is there any reason not to put it in the
> >> core code?
> >
> > Yes, there is a reason for it: if I added a boolean dma_coherent flag in
> > struct device as Catalin initially suggested, what would be the default
> > for each architecture? Where would I set it for arch that don't use
> > device tree? It is not easy.
> >
> > I thought it would be better to introduce is_device_dma_coherent only on
> > the architectures where it certainly makes sense to have it. In fact I
> > checked and arm and arm64 are the only architectures to define
> > set_arch_dma_coherent_ops at the moment. At that point if
> > is_device_dma_coherent becomes arch-specific, it makes sense to store
> > the flag in dev_archdata instead of struct device.
>
> The proposition from Will looks reasonable to me too, because
> there is a "small" side-effect of adding this kind of property to
> arch-specific data, or even to the core device structure. ;(
>
> There are some sub-systems in the kernel which do not create their devices
> from DT; instead, a host device populates its child devices manually.
> I know of at least two cases:
> - usb: dwc3 core creates xhci device manually
> - pci: adds its client devices
>
> In such cases the DMA configuration has to be propagated from host to
> child (in our case the host device got its DMA configuration from DT), like:
> dma_set_coherent_mask(&xhci->dev, dwc->dev->coherent_dma_mask);
>
> xhci->dev.parent = dwc->dev;
> xhci->dev.dma_mask = dwc->dev->dma_mask;
> xhci->dev.dma_parms = dwc->dev->dma_parms;
>
> So, once a new DMA property is added, it has to be propagated from
> host to child device too.
>
> Recently the new property dma_pfn_offset was introduced in struct device
> and this kind of problem was observed on Keystone 2:
> - for the usb case it was fixed using a platform bus notifier (xhci is a platform device);
> - for pci the work is in progress, because the solution with a PCI bus notifier
> was rejected: https://lkml.org/lkml/2014/10/10/308.
>
> In general, if dma_coherent belongs to struct device then
> such problems can be fixed directly in drivers/subsystems:
> xhci->dev.dma_coherent = dwc->dev->dma_coherent;
>
> But if it is arch-specific data, it will be impossible to
> set without introducing proper arch-specific setter/getter functions.
>
> Also, as an idea, we are thinking about introducing something like:
> void dma_apply_parent_cfg(struct device *dev, struct device *parent)
> which would ensure that all DMA configuration is properly copied from
> parent to child device. For now that would cover (at minimum, for ARM):
> dma_mask
> coherent_dma_mask
> dma_parms
> dma_pfn_offset
> dev_archdata->dma_ops
> [dma_coherent]?

I understand your concern but the problem you have goes far beyond a
simple dma_coherent flag: what about all the other dev_archdata fields?
Aside from dma_ops, on some other architectures there might be other
data structrures in dev_archdata that need to be properly initialized
from the parent.

Your idea of introducing something like dma_apply_parent_cfg is the only
solid solution I can see. However I would consider naming it something
more generic like init_dev_from_parent to handle other possible
configurations (inside or outside dev_archdata) that might have to be
initialized from information on the parent device.


Regarding the dma_coherent flag, I still prefer this approach because
introducing the dma_coherent flag in dev_archdata wouldn't make this
issue any worse than it already is, but would avoid other problems as
mentioned in my previous reply.

2014-11-05 16:57:00

by Catalin Marinas

Subject: Re: [PATCH v7 3/8] arm64: introduce is_device_dma_coherent

On Mon, Nov 03, 2014 at 11:10:18AM +0000, Stefano Stabellini wrote:
> On Mon, 3 Nov 2014, Will Deacon wrote:
> > On Mon, Nov 03, 2014 at 10:46:03AM +0000, Stefano Stabellini wrote:
> > > On Mon, 27 Oct 2014, Stefano Stabellini wrote:
> > > > Introduce a boolean flag and an accessor function to check whether a
> > > > device is dma_coherent. Set the flag from set_arch_dma_coherent_ops.
> > > >
> > > > Signed-off-by: Stefano Stabellini <[email protected]>
> > > > Signed-off-by: Catalin Marinas <[email protected]>
> > > > CC: [email protected]
> > >
> > > Will, Catalin,
> > > are you OK with this patch?
> >
> > It would be nicer if the dma_coherent flag didn't have to be duplicated by
> > each architecture in dev_archdata. Is there any reason not to put it in the
> > core code?
>
> Yes, there is a reason for it: if I added a boolean dma_coherent flag in
> struct device as Catalin initially suggested, what would be the default
> for each architecture? Where would I set it for arch that don't use
> device tree?

You don't need to. An architecture whose DMA is always coherent doesn't
need to do anything. One whose DMA is always non-coherent only needs to
select HAVE_DMA_NONCOHERENT. One that has a mix of both needs to find a
way to set dev->dma_coherent. Since that's a new API you introduce, it
doesn't break any existing architectures.

Note that if !is_device_dma_coherent(), it doesn't always mean that
standard cache maintenance would be enough (but that's a Xen problem,
not sure how to solve).


diff --git a/arch/Kconfig b/arch/Kconfig
index 05d7a8a458d5..8462b2e7491b 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -203,6 +203,9 @@ config HAVE_DMA_ATTRS
config HAVE_DMA_CONTIGUOUS
bool

+config HAVE_DMA_NONCOHERENT
+ bool
+
config GENERIC_SMP_IDLE_THREAD
bool

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 89c4b5ccc68d..fd7d5522764c 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -40,6 +40,7 @@ config ARM
select HAVE_DMA_API_DEBUG
select HAVE_DMA_ATTRS
select HAVE_DMA_CONTIGUOUS if MMU
+ select HAVE_DMA_NONCOHERENT if OF
select HAVE_DYNAMIC_FTRACE if (!XIP_KERNEL)
select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || CPU_V7) && MMU
select HAVE_FTRACE_MCOUNT_RECORD if (!XIP_KERNEL)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 9532f8d5857e..eb7a5aa64e0e 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -46,6 +46,7 @@ config ARM64
select HAVE_DMA_API_DEBUG
select HAVE_DMA_ATTRS
select HAVE_DMA_CONTIGUOUS
+ select HAVE_DMA_NONCOHERENT
select HAVE_DYNAMIC_FTRACE
select HAVE_EFFICIENT_UNALIGNED_ACCESS
select HAVE_FTRACE_MCOUNT_RECORD
diff --git a/drivers/of/platform.c b/drivers/of/platform.c
index 3b64d0bf5bba..7e827726b702 100644
--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -183,6 +183,7 @@ static void of_dma_configure(struct device *dev)
* dma coherent operations.
*/
if (of_dma_is_coherent(dev->of_node)) {
+ dev->dma_coherent = true;
set_arch_dma_coherent_ops(dev);
dev_dbg(dev, "device is dma coherent\n");
}
diff --git a/include/linux/device.h b/include/linux/device.h
index ce1f21608b16..e00ca876db01 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -796,6 +796,7 @@ struct device {

bool offline_disabled:1;
bool offline:1;
+ bool dma_coherent:1;
};

static inline struct device *kobj_to_dev(struct kobject *kobj)
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index d5d388160f42..0bdffba2337d 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -78,6 +78,18 @@ static inline int is_device_dma_capable(struct device *dev)
return dev->dma_mask != NULL && *dev->dma_mask != DMA_MASK_NONE;
}

+#ifdef CONFIG_HAVE_DMA_NONCOHERENT
+static inline int is_device_dma_coherent(struct device *dev)
+{
+ return dev->dma_coherent;
+}
+#else
+static inline int is_device_dma_coherent(struct device *dev)
+{
+ return 1;
+}
+#endif
+
#ifdef CONFIG_HAS_DMA
#include <asm/dma-mapping.h>
#else

2014-11-05 18:15:54

by Stefano Stabellini

[permalink] [raw]
Subject: Re: [PATCH v7 3/8] arm64: introduce is_device_dma_coherent

On Wed, 5 Nov 2014, Catalin Marinas wrote:
> On Mon, Nov 03, 2014 at 11:10:18AM +0000, Stefano Stabellini wrote:
> > On Mon, 3 Nov 2014, Will Deacon wrote:
> > > On Mon, Nov 03, 2014 at 10:46:03AM +0000, Stefano Stabellini wrote:
> > > > On Mon, 27 Oct 2014, Stefano Stabellini wrote:
> > > > > Introduce a boolean flag and an accessor function to check whether a
> > > > > device is dma_coherent. Set the flag from set_arch_dma_coherent_ops.
> > > > >
> > > > > Signed-off-by: Stefano Stabellini <[email protected]>
> > > > > Signed-off-by: Catalin Marinas <[email protected]>
> > > > > CC: [email protected]
> > > >
> > > > Will, Catalin,
> > > > are you OK with this patch?
> > >
> > > It would be nicer if the dma_coherent flag didn't have to be duplicated by
> > > each architecture in dev_archdata. Is there any reason not to put it in the
> > > core code?
> >
> > Yes, there is a reason for it: if I added a boolean dma_coherent flag in
> > struct device as Catalin initially suggested, what would be the default
> > for each architecture? Where would I set it for arches that don't use
> > device tree?
>
> You don't need to. An architecture that has coherent DMA always doesn't
> need to do anything. One that has non-coherent DMA always only needs to
> select HAVE_DMA_NONCOHERENT. One that has a mix of both needs to find a
> way to set dev->dma_coherent. Since that's a new API you introduce, it
> doesn't break any existing architectures.

I am not sure that this is better than the current patch but I can see
that this approach is not too controversial, so I am happy to go with
whatever the maintainers prefer.


> Note that if !is_device_dma_coherent(), it doesn't always mean that
> standard cache maintenance would be enough (but that's a Xen problem,
> not sure how to solve).

It is a thorny issue indeed.
Xen would need to know how to do non-standard cache maintenance
operations.
Otherwise we would need to resurrect XENFEAT_grant_map_identity (that I
am reverting in this series) and be content with having CONFIG_XEN_DOM0
depend on CONFIG_ARM_LPAE.


> diff --git a/arch/Kconfig b/arch/Kconfig
> index 05d7a8a458d5..8462b2e7491b 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -203,6 +203,9 @@ config HAVE_DMA_ATTRS
> config HAVE_DMA_CONTIGUOUS
> bool
>
> +config HAVE_DMA_NONCOHERENT
> + bool
> +
> config GENERIC_SMP_IDLE_THREAD
> bool
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 89c4b5ccc68d..fd7d5522764c 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -40,6 +40,7 @@ config ARM
> select HAVE_DMA_API_DEBUG
> select HAVE_DMA_ATTRS
> select HAVE_DMA_CONTIGUOUS if MMU
> + select HAVE_DMA_NONCOHERENT if OF
> select HAVE_DYNAMIC_FTRACE if (!XIP_KERNEL)
> select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || CPU_V7) && MMU
> select HAVE_FTRACE_MCOUNT_RECORD if (!XIP_KERNEL)
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 9532f8d5857e..eb7a5aa64e0e 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -46,6 +46,7 @@ config ARM64
> select HAVE_DMA_API_DEBUG
> select HAVE_DMA_ATTRS
> select HAVE_DMA_CONTIGUOUS
> + select HAVE_DMA_NONCOHERENT
> select HAVE_DYNAMIC_FTRACE
> select HAVE_EFFICIENT_UNALIGNED_ACCESS
> select HAVE_FTRACE_MCOUNT_RECORD
> diff --git a/drivers/of/platform.c b/drivers/of/platform.c
> index 3b64d0bf5bba..7e827726b702 100644
> --- a/drivers/of/platform.c
> +++ b/drivers/of/platform.c
> @@ -183,6 +183,7 @@ static void of_dma_configure(struct device *dev)
> * dma coherent operations.
> */
> if (of_dma_is_coherent(dev->of_node)) {
> + dev->dma_coherent = true;
> set_arch_dma_coherent_ops(dev);
> dev_dbg(dev, "device is dma coherent\n");
> }

I think that this would need to be #ifdef'ed as it is possible to have
OF support but no HAVE_DMA_NONCOHERENT (PPC?).


> diff --git a/include/linux/device.h b/include/linux/device.h
> index ce1f21608b16..e00ca876db01 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -796,6 +796,7 @@ struct device {
>
> bool offline_disabled:1;
> bool offline:1;
> + bool dma_coherent:1;
> };

I guess we would have to #ifdef CONFIG_HAVE_DMA_NONCOHERENT the
dma_coherent flag, right? Otherwise architectures that do not select
CONFIG_HAVE_DMA_NONCOHERENT (x86 for example) would end up with a flag
in struct device that doesn't reflect the properties of the device (dma
coherent devices with dev->dma_coherent == 0).

Overall it is a lot of ifdefs for not so much code sharing.


> static inline struct device *kobj_to_dev(struct kobject *kobj)
> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
> index d5d388160f42..0bdffba2337d 100644
> --- a/include/linux/dma-mapping.h
> +++ b/include/linux/dma-mapping.h
> @@ -78,6 +78,18 @@ static inline int is_device_dma_capable(struct device *dev)
> return dev->dma_mask != NULL && *dev->dma_mask != DMA_MASK_NONE;
> }
>
> +#ifdef CONFIG_HAVE_DMA_NONCOHERENT
> +static inline int is_device_dma_coherent(struct device *dev)
> +{
> + return dev->dma_coherent;
> +}
> +#else
> +static inline int is_device_dma_coherent(struct device *dev)
> +{
> + return 1;
> +}
> +#endif
> +
> #ifdef CONFIG_HAS_DMA
> #include <asm/dma-mapping.h>
> #else

2014-11-06 10:33:50

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH v7 3/8] arm64: introduce is_device_dma_coherent

On Wed, Nov 05, 2014 at 06:15:38PM +0000, Stefano Stabellini wrote:
> On Wed, 5 Nov 2014, Catalin Marinas wrote:
> > On Mon, Nov 03, 2014 at 11:10:18AM +0000, Stefano Stabellini wrote:
> > > On Mon, 3 Nov 2014, Will Deacon wrote:
> > > > On Mon, Nov 03, 2014 at 10:46:03AM +0000, Stefano Stabellini wrote:
> > > > > On Mon, 27 Oct 2014, Stefano Stabellini wrote:
> > > > > > Introduce a boolean flag and an accessor function to check whether a
> > > > > > device is dma_coherent. Set the flag from set_arch_dma_coherent_ops.
> > > > > >
> > > > > > Signed-off-by: Stefano Stabellini <[email protected]>
> > > > > > Signed-off-by: Catalin Marinas <[email protected]>
> > > > > > CC: [email protected]
> > > > >
> > > > > Will, Catalin,
> > > > > are you OK with this patch?
> > > >
> > > > It would be nicer if the dma_coherent flag didn't have to be duplicated by
> > > > each architecture in dev_archdata. Is there any reason not to put it in the
> > > > core code?
> > >
> > > Yes, there is a reason for it: if I added a boolean dma_coherent flag in
> > > struct device as Catalin initially suggested, what would be the default
> > > for each architecture? Where would I set it for arches that don't use
> > > device tree?
> >
> > You don't need to. An architecture that has coherent DMA always doesn't
> > need to do anything. One that has non-coherent DMA always only needs to
> > select HAVE_DMA_NONCOHERENT. One that has a mix of both needs to find a
> > way to set dev->dma_coherent. Since that's a new API you introduce, it
> > doesn't break any existing architectures.
>
> I am not sure that this is better than the current patch but I can see
> that this approach is not too controversial, so I am happy to go with
> whatever the maintainers prefer.

Functionally it is the same, but just less code duplication.

> > Note that if !is_device_dma_coherent(), it doesn't always mean that
> > standard cache maintenance would be enough (but that's a Xen problem,
> > not sure how to solve).
>
> It is a thorny issue indeed.
> Xen would need to know how to do non-standard cache maintenance
> operations.

Is EL2 hyp or EL1 dom0 doing such maintenance? If the latter, you could
just use the currently registered dma ops.

> Otherwise we would need to resurrect XENFEAT_grant_map_identity (that I
> am reverting in this series) and be content with having CONFIG_XEN_DOM0
> depend on CONFIG_ARM_LPAE.

So what does it buy you? Is it just the hope that with LPAE you won't have
weird system caches?

> > diff --git a/arch/Kconfig b/arch/Kconfig
> > index 05d7a8a458d5..8462b2e7491b 100644
> > --- a/arch/Kconfig
> > +++ b/arch/Kconfig
> > @@ -203,6 +203,9 @@ config HAVE_DMA_ATTRS
> > config HAVE_DMA_CONTIGUOUS
> > bool
> >
> > +config HAVE_DMA_NONCOHERENT
> > + bool
> > +
> > config GENERIC_SMP_IDLE_THREAD
> > bool
> >
> > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> > index 89c4b5ccc68d..fd7d5522764c 100644
> > --- a/arch/arm/Kconfig
> > +++ b/arch/arm/Kconfig
> > @@ -40,6 +40,7 @@ config ARM
> > select HAVE_DMA_API_DEBUG
> > select HAVE_DMA_ATTRS
> > select HAVE_DMA_CONTIGUOUS if MMU
> > + select HAVE_DMA_NONCOHERENT if OF
> > select HAVE_DYNAMIC_FTRACE if (!XIP_KERNEL)
> > select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || CPU_V7) && MMU
> > select HAVE_FTRACE_MCOUNT_RECORD if (!XIP_KERNEL)
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index 9532f8d5857e..eb7a5aa64e0e 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -46,6 +46,7 @@ config ARM64
> > select HAVE_DMA_API_DEBUG
> > select HAVE_DMA_ATTRS
> > select HAVE_DMA_CONTIGUOUS
> > + select HAVE_DMA_NONCOHERENT
> > select HAVE_DYNAMIC_FTRACE
> > select HAVE_EFFICIENT_UNALIGNED_ACCESS
> > select HAVE_FTRACE_MCOUNT_RECORD
> > diff --git a/drivers/of/platform.c b/drivers/of/platform.c
> > index 3b64d0bf5bba..7e827726b702 100644
> > --- a/drivers/of/platform.c
> > +++ b/drivers/of/platform.c
> > @@ -183,6 +183,7 @@ static void of_dma_configure(struct device *dev)
> > * dma coherent operations.
> > */
> > if (of_dma_is_coherent(dev->of_node)) {
> > + dev->dma_coherent = true;
> > set_arch_dma_coherent_ops(dev);
> > dev_dbg(dev, "device is dma coherent\n");
> > }
>
> I think that this would need to be #ifdef'ed as it is possible to have
> OF support but no HAVE_DMA_NONCOHERENT (PPC?).

The field is always there. But with !HAVE_DMA_NONCOHERENT,
is_device_dma_coherent() would always return 1. You could avoid
defining is_device_dma_coherent() entirely when !HAVE_DMA_NONCOHERENT,
it wouldn't be worse than your patch in terms of an undefined function.
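
Something like this, I mean, with no #else stub at all (a sketch, untested):

#ifdef CONFIG_HAVE_DMA_NONCOHERENT
static inline int is_device_dma_coherent(struct device *dev)
{
	return dev->dma_coherent;
}
#endif /* callers must depend on HAVE_DMA_NONCOHERENT */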

> > diff --git a/include/linux/device.h b/include/linux/device.h
> > index ce1f21608b16..e00ca876db01 100644
> > --- a/include/linux/device.h
> > +++ b/include/linux/device.h
> > @@ -796,6 +796,7 @@ struct device {
> >
> > bool offline_disabled:1;
> > bool offline:1;
> > + bool dma_coherent:1;
> > };
>
> I guess we would have to #ifdef CONFIG_HAVE_DMA_NONCOHERENT the
> dma_coherent flag, right? Otherwise architectures that do not select
> CONFIG_HAVE_DMA_NONCOHERENT (x86 for example) would end up with a flag
> in struct device that doesn't reflect the properties of the device (dma
> coherent devices with dev->dma_coherent == 0).

In my proposal you should not read this field directly but rather access
it only via is_device_dma_coherent() (you can add a function for setting
it as well).
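
Something along these lines for the setter (a sketch, name up to you):

static inline void set_device_dma_coherent(struct device *dev, bool coherent)
{
	dev->dma_coherent = coherent;
}

so that of_dma_configure() never has to touch dev->dma_coherent directly.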

--
Catalin

2014-11-06 12:22:26

by Stefano Stabellini

[permalink] [raw]
Subject: Re: [PATCH v7 3/8] arm64: introduce is_device_dma_coherent

On Thu, 6 Nov 2014, Catalin Marinas wrote:
> On Wed, Nov 05, 2014 at 06:15:38PM +0000, Stefano Stabellini wrote:
> > On Wed, 5 Nov 2014, Catalin Marinas wrote:
> > > On Mon, Nov 03, 2014 at 11:10:18AM +0000, Stefano Stabellini wrote:
> > > > On Mon, 3 Nov 2014, Will Deacon wrote:
> > > > > On Mon, Nov 03, 2014 at 10:46:03AM +0000, Stefano Stabellini wrote:
> > > > > > On Mon, 27 Oct 2014, Stefano Stabellini wrote:
> > > > > > > Introduce a boolean flag and an accessor function to check whether a
> > > > > > > device is dma_coherent. Set the flag from set_arch_dma_coherent_ops.
> > > > > > >
> > > > > > > Signed-off-by: Stefano Stabellini <[email protected]>
> > > > > > > Signed-off-by: Catalin Marinas <[email protected]>
> > > > > > > CC: [email protected]
> > > > > >
> > > > > > Will, Catalin,
> > > > > > are you OK with this patch?
> > > > >
> > > > > It would be nicer if the dma_coherent flag didn't have to be duplicated by
> > > > > each architecture in dev_archdata. Is there any reason not to put it in the
> > > > > core code?
> > > >
> > > > Yes, there is a reason for it: if I added a boolean dma_coherent flag in
> > > > struct device as Catalin initially suggested, what would be the default
> > > > for each architecture? Where would I set it for arches that don't use
> > > > device tree?
> > >
> > > You don't need to. An architecture that has coherent DMA always doesn't
> > > need to do anything. One that has non-coherent DMA always only needs to
> > > select HAVE_DMA_NONCOHERENT. One that has a mix of both needs to find a
> > > way to set dev->dma_coherent. Since that's a new API you introduce, it
> > > doesn't break any existing architectures.
> >
> > I am not sure that this is better than the current patch but I can see
> > that this approach is not too controversial, so I am happy to go with
> > whatever the maintainers prefer.
>
> Functionally it is the same, but just less code duplication.
>
> > > Note that if !is_device_dma_coherent(), it doesn't always mean that
> > > standard cache maintenance would be enough (but that's a Xen problem,
> > > not sure how to solve).
> >
> > It is a thorny issue indeed.
> > Xen would need to know how to do non-standard cache maintenance
> > operations.
>
> Is EL2 hyp or EL1 dom0 doing such maintenance? If the latter, you could
> just use the currently registered dma ops.

It is Xen, so the EL2 hypervisor.


> > Otherwise we would need to resurrect XENFEAT_grant_map_identity (that I
> > am reverting in this series) and be content with having CONFIG_XEN_DOM0
> > depend on CONFIG_ARM_LPAE.
>
> So what does it buy you? Is it just the hope that with LPAE you won't have
> weird system caches?

Let me explain a bit more and let's assume there is no SMMU.

The problem is not normal dma originating in dom0, currently mapped 1:1
(pseudo-physical == physical). The problem are dma operations that
involve foreign pages (pages belonging to other virtual machines)
currently mapped also in dom0 to do IO. It is common for the Xen PV
network and block frontends to grant access to one or more pages to the
network and block backends for IO. The backends, usually in dom0, map
the pages and perform the actual IO. Sometimes the IO is a dma operation
that involves one of the granted pages directly.

For these pages, the pseudo-physical address in dom0 is different from
the physical address. Dom0 keeps track of the pseudo-physical to
physical relationship (pfn_to_mfn in Xen terminology). The swiotlb-xen
driver makes sure to return the right mfn from map_page and map_sg.

However some of the dma_ops give you only a dma_addr_t handle
(unmap_page and unmap_sg), that is the physical address of the page.
Dom0 doesn't know how to find the pseudo-physical address corresponding
to it. Therefore it also doesn't know how to find the struct page for
it. This is not a trivial problem to solve as the same page can be
granted multiple times, therefore could have multiple pseudo-physical
addresses and multiple struct pages corresponding to one physical
address in dom0.

Fortunately these dma_ops don't actually need to do much. In fact they
only need to flush caches if the device is not dma-coherent. So the
current proposed solution is to issue a hypercall to ask Xen to do the
flushing, passing the physical address dom0 knows.

The older solution that this series is reverting, is based on
XENFEAT_grant_map_identity, that was a feature offered by Xen, already
reverted in the hypervisor. XENFEAT_grant_map_identity told dom0 that
the hypervisor mapped foreign pages at the address requested by dom0
*and* at pseudo-physical == physical address. That way Dom0 could always
find the page at pseudo-physical == physical address and do the flushing
there. However it only works if Dom0 is compiled with CONFIG_ARM_LPAE
(a non-LPAE kernel is not guaranteed to be able to reach the
pseudo-physical address in question) and we didn't want to introduce
that limitation so we changed approach.
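
To make it concrete, the flush for a foreign page ends up looking roughly
like this (a sketch of the patch 8/8 approach, modulo the exact interface
details):

	struct gnttab_cache_flush cflush;

	/* dom0 only knows the machine (bus) address of the foreign page */
	cflush.a.dev_bus_addr = handle & PAGE_MASK;
	cflush.offset = handle & ~PAGE_MASK;
	cflush.length = size;
	/* clean for device reads, invalidate for device writes */
	cflush.op = (dir == DMA_FROM_DEVICE) ? GNTTAB_CACHE_INVAL :
					       GNTTAB_CACHE_CLEAN;

	HYPERVISOR_grant_table_op(GNTTABOP_cache_flush, &cflush, 1);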


> > > diff --git a/arch/Kconfig b/arch/Kconfig
> > > index 05d7a8a458d5..8462b2e7491b 100644
> > > --- a/arch/Kconfig
> > > +++ b/arch/Kconfig
> > > @@ -203,6 +203,9 @@ config HAVE_DMA_ATTRS
> > > config HAVE_DMA_CONTIGUOUS
> > > bool
> > >
> > > +config HAVE_DMA_NONCOHERENT
> > > + bool
> > > +
> > > config GENERIC_SMP_IDLE_THREAD
> > > bool
> > >
> > > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> > > index 89c4b5ccc68d..fd7d5522764c 100644
> > > --- a/arch/arm/Kconfig
> > > +++ b/arch/arm/Kconfig
> > > @@ -40,6 +40,7 @@ config ARM
> > > select HAVE_DMA_API_DEBUG
> > > select HAVE_DMA_ATTRS
> > > select HAVE_DMA_CONTIGUOUS if MMU
> > > + select HAVE_DMA_NONCOHERENT if OF
> > > select HAVE_DYNAMIC_FTRACE if (!XIP_KERNEL)
> > > select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || CPU_V7) && MMU
> > > select HAVE_FTRACE_MCOUNT_RECORD if (!XIP_KERNEL)
> > > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > > index 9532f8d5857e..eb7a5aa64e0e 100644
> > > --- a/arch/arm64/Kconfig
> > > +++ b/arch/arm64/Kconfig
> > > @@ -46,6 +46,7 @@ config ARM64
> > > select HAVE_DMA_API_DEBUG
> > > select HAVE_DMA_ATTRS
> > > select HAVE_DMA_CONTIGUOUS
> > > + select HAVE_DMA_NONCOHERENT
> > > select HAVE_DYNAMIC_FTRACE
> > > select HAVE_EFFICIENT_UNALIGNED_ACCESS
> > > select HAVE_FTRACE_MCOUNT_RECORD
> > > diff --git a/drivers/of/platform.c b/drivers/of/platform.c
> > > index 3b64d0bf5bba..7e827726b702 100644
> > > --- a/drivers/of/platform.c
> > > +++ b/drivers/of/platform.c
> > > @@ -183,6 +183,7 @@ static void of_dma_configure(struct device *dev)
> > > * dma coherent operations.
> > > */
> > > if (of_dma_is_coherent(dev->of_node)) {
> > > + dev->dma_coherent = true;
> > > set_arch_dma_coherent_ops(dev);
> > > dev_dbg(dev, "device is dma coherent\n");
> > > }
> >
> > I think that this would need to be #ifdef'ed as it is possible to have
> > OF support but no HAVE_DMA_NONCOHERENT (PPC?).
>
> The field is always there. But with !HAVE_DMA_NONCOHERENT,
> is_device_dma_coherent() would always return 1. You could avoid
> defining is_device_dma_coherent() entirely when !HAVE_DMA_NONCOHERENT,
> it wouldn't be worse than your patch in terms of an undefined function.
>
> > > diff --git a/include/linux/device.h b/include/linux/device.h
> > > index ce1f21608b16..e00ca876db01 100644
> > > --- a/include/linux/device.h
> > > +++ b/include/linux/device.h
> > > @@ -796,6 +796,7 @@ struct device {
> > >
> > > bool offline_disabled:1;
> > > bool offline:1;
> > > + bool dma_coherent:1;
> > > };
> >
> > I guess we would have to #ifdef CONFIG_HAVE_DMA_NONCOHERENT the
> > dma_coherent flag, right? Otherwise architectures that do not select
> > CONFIG_HAVE_DMA_NONCOHERENT (x86 for example) would end up with a flag
> > in struct device that doesn't reflect the properties of the device (dma
> > coherent devices with dev->dma_coherent == 0).
>
> In my proposal you should not read this field directly but rather access
> it only via is_device_dma_coherent() (you can add a function for setting
> it as well).

This is the part that I don't like: on other architectures you now have a
field in struct device that is actually incorrect. It is true that one
should call the accessor function to read the property but still...

As I don't feel that strongly against it, I'll resend including this
patch (with you as the author of course).

2014-11-06 13:40:48

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH v7 3/8] arm64: introduce is_device_dma_coherent

On Thu, Nov 06, 2014 at 12:22:13PM +0000, Stefano Stabellini wrote:
> As I don't feel that strongly against it, I'll resend including this
> patch (with you as the author of course).

Not just yet, let me understand your reply properly ;).

--
Catalin

2014-11-07 11:06:17

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH v7 3/8] arm64: introduce is_device_dma_coherent

On Thu, Nov 06, 2014 at 12:22:13PM +0000, Stefano Stabellini wrote:
> On Thu, 6 Nov 2014, Catalin Marinas wrote:
> > On Wed, Nov 05, 2014 at 06:15:38PM +0000, Stefano Stabellini wrote:
> > > On Wed, 5 Nov 2014, Catalin Marinas wrote:
> > > > Note that if !is_device_dma_coherent(), it doesn't always mean that
> > > > standard cache maintenance would be enough (but that's a Xen problem,
> > > > not sure how to solve).
> > >
> > > It is a thorny issue indeed.
> > > Xen would need to know how to do non-standard cache maintenance
> > > operations.
> >
> > Is EL2 hyp or EL1 dom0 doing such maintenance? If the latter, you could
> > just use the currently registered dma ops.
>
> It is Xen, so the EL2 hypervisor.

So what is arch/arm/xen/mm32.c cache maintenance for? Doesn't it run in
dom0?

> > > Otherwise we would need to resurrect XENFEAT_grant_map_identity (that I
> > > am reverting in this series) and be content with having CONFIG_XEN_DOM0
> > > depend on CONFIG_ARM_LPAE.
> >
> > So what does it buy you? Is it just the hope that with LPAE you won't have
> > weird system caches?
>
> Let me explain a bit more and let's assume there is no SMMU.
>
> The problem is not normal dma originating in dom0, currently mapped 1:1
> (pseudo-physical == physical). The problem are dma operations that
> involve foreign pages (pages belonging to other virtual machines)
> currently mapped also in dom0 to do IO. It is common for the Xen PV
> network and block frontends to grant access to one or more pages to the
> network and block backends for IO. The backends, usually in dom0, map
> the pages and perform the actual IO. Sometimes the IO is a dma operation
> that involves one of the granted pages directly.
>
> For these pages, the pseudo-physical address in dom0 is different from
> the physical address. Dom0 keeps track of the pseudo-physical to
> physical relationship (pfn_to_mfn in Xen terminology). The swiotlb-xen
> driver makes sure to return the right mfn from map_page and map_sg.

For my understanding, xen_dma_map_page() is called from dom0 and it
simply calls the __generic_dma_ops()->map_page(). The "page" argument
here is the dom0 page and it gets translated to dom0 phys and then to a
bus address via xen_phys_to_bus().
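
That is, just to restate the chain (not code from the series):

	/* map side, inside xen_swiotlb_map_page() */
	phys_addr_t phys = page_to_phys(page) + offset;	/* dom0 pseudo-phys */
	dma_addr_t dev_addr = xen_phys_to_bus(phys);	/* pfn_to_mfn() based */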

On arm64, __swiotlb_map_page() uses the page structure to get some
pseudo device address which gets converted back to a dom0 virtual
address. The latter is used for cache maintenance by VA in dom0. With
PIPT-like caches, that's fine, you don't need the actual machine PA
unless you have caches like PL310 (which are not compatible with
virtualisation anyway and the latest ARMv8 ARM even makes a clear
statement here).

(one potential problem here is the dma_capable() check in
swiotlb_map_page() (non-xen) which uses the pseudo device address)

The interesting part is that xen_swiotlb_map_page() returns the real
machine device address to dom0, which is different from what dom0 thinks
of a device address (phys_to_dma(page_to_phys(page))). Does dom0 use
such a real/machine device address to program the device directly, or is
there another layer where you could do the translation?

> However some of the dma_ops give you only a dma_addr_t handle
> (unmap_page and unmap_sg), that is the physical address of the page.

OK, I started to get it now, the __swiotlb_unmap_page() on arm64, if
called from dom0, would get the machine device address and dma_to_phys()
does not work.

> Dom0 doesn't know how to find the pseudo-physical address corresponding
> to it. Therefore it also doesn't know how to find the struct page for
> it. This is not a trivial problem to solve as the same page can be
> granted multiple times, therefore could have multiple pseudo-physical
> addresses and multiple struct pages corresponding to one physical
> address in dom0.

So what is mfn_to_pfn() and xen_bus_to_phys() used for? Isn't
mfn_to_pfn(pfn_to_mfn(x)) an identity function?

> Fortunately these dma_ops don't actually need to do much. In fact they
> only need to flush caches if the device is not dma-coherent. So the
> current proposed solution is to issue a hypercall to ask Xen to do the
> flushing, passing the physical address dom0 knows.

I really don't like the dma_cache_maint() duplication you added in
arch/arm/xen/mm32.c just because you cannot call the host dma_ops when
the page is not available.

You basically only need a VA that was mapped in dom0 when map_page() was
called and is still mapped when page_unmap() is called. You can free the
actual mapping after calling __generic_dma_ops()->unmap_page() in
xen_dma_unmap_page() (in xen_unmap_single()).

BTW, drivers/xen/swiotlb-xen.c needs something like:

dma_addr_t dev_addr = xen_phys_to_bus(phys_to_dma(phys));

phys != bus address.

--
Catalin

2014-11-07 11:11:46

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH v7 3/8] arm64: introduce is_device_dma_coherent

On Fri, Nov 07, 2014 at 11:05:25AM +0000, Catalin Marinas wrote:
> BTW, drivers/xen/swiotlb-xen.c needs something like:
>
> dma_addr_t dev_addr = xen_phys_to_bus(phys_to_dma(phys));
>
> phys != bus address.

Wrong place, what I meant is when calling:

xen_dma_unmap_page(hwdev, phys_to_dma(paddr), size, dir, attrs);

--
Catalin

2014-11-07 14:12:12

by Stefano Stabellini

[permalink] [raw]
Subject: Re: [PATCH v7 3/8] arm64: introduce is_device_dma_coherent

On Fri, 7 Nov 2014, Catalin Marinas wrote:
> On Thu, Nov 06, 2014 at 12:22:13PM +0000, Stefano Stabellini wrote:
> > On Thu, 6 Nov 2014, Catalin Marinas wrote:
> > > On Wed, Nov 05, 2014 at 06:15:38PM +0000, Stefano Stabellini wrote:
> > > > On Wed, 5 Nov 2014, Catalin Marinas wrote:
> > > > > Note that if !is_device_dma_coherent(), it doesn't always mean that
> > > > > standard cache maintenance would be enough (but that's a Xen problem,
> > > > > not sure how to solve).
> > > >
> > > > It is a thorny issue indeed.
> > > > Xen would need to know how to do non-standard cache maintenance
> > > > operations.
> > >
> > > Is EL2 hyp or EL1 dom0 doing such maintenance? If the latter, you could
> > > just use the currently registered dma ops.
> >
> > It is Xen, so the EL2 hypervisor.
>
> So what is arch/arm/xen/mm32.c cache maintenance for? Doesn't it run in
> dom0?

This patch series changes that code path specifically. As a matter of
fact it removes mm32.c.
Before this patch series, cache maintenance in dom0 was being done with
XENFEAT_grant_map_identity (see explanation in previous email).
After this patch series, a hypercall is made when foreign pages are
involved, see patch 8/8, the if(!pfn_valid(pfn)) code path. Local pages
are still flushed in dom0.


> > > > Otherwise we would need to resurrect XENFEAT_grant_map_identity (that I
> > > > am reverting in this series) and be content with having CONFIG_XEN_DOM0
> > > > depend on CONFIG_ARM_LPAE.
> > >
> > > So what does it buy you? Is it just the hope that with LPAE you won't have
> > > weird system caches?
> >
> > Let me explain a bit more and let's assume there is no SMMU.
> >
> > The problem is not normal dma originating in dom0, currently mapped 1:1
> > (pseudo-physical == physical). The problem are dma operations that
> > involve foreign pages (pages belonging to other virtual machines)
> > currently mapped also in dom0 to do IO. It is common for the Xen PV
> > network and block frontends to grant access to one or more pages to the
> > network and block backends for IO. The backends, usually in dom0, map
> > the pages and perform the actual IO. Sometimes the IO is a dma operation
> > that involves one of the granted pages directly.
> >
> > For these pages, the pseudo-physical address in dom0 is different from
> > the physical address. Dom0 keeps track of the pseudo-physical to
> > physical relationship (pfn_to_mfn in Xen terminology). The swiotlb-xen
> > driver makes sure to return the right mfn from map_page and map_sg.
>
> For my understanding, xen_dma_map_page() is called from dom0 and it
> simply calls the __generic_dma_ops()->map_page(). The "page" argument
> here is the dom0 page and it gets translated to dom0 phys and then to a
> bus address via xen_phys_to_bus().
>
> On arm64, __swiotlb_map_page() uses the page structure to get some
> pseudo device address which gets converted back to a dom0 virtual
> address. The latter is used for cache maintenance by VA in dom0. With
> PIPT-like caches, that's fine, you don't need the actual machine PA
> unless you have caches like PL310 (which are not compatible with
> virtualisation anyway and the latest ARMv8 ARM even makes a clear
> statement here).
>
> (one potential problem here is the dma_capable() check in
> swiotlb_map_page() (non-xen) which uses the pseudo device address)
>
> The interesting part is that xen_swiotlb_map_page() returns the real
> machine device address to dom0, which is different from what dom0 thinks
> of a device address (phys_to_dma(page_to_phys(page))). Does dom0 use
> such a real/machine device address to program the device directly, or is
> there another layer where you could do the translation?

No other layer. Dom0 uses the real/machine device address returned by
xen_phys_to_bus to program the device directly.


> > However some of the dma_ops give you only a dma_addr_t handle
> > (unmap_page and unmap_sg), that is the physical address of the page.
>
> OK, I started to get it now, the __swiotlb_unmap_page() on arm64, if
> called from dom0, would get the machine device address and dma_to_phys()
> does not work.

Right!


> > Dom0 doesn't know how to find the pseudo-physical address corresponding
> > to it. Therefore it also doesn't know how to find the struct page for
> > it. This is not a trivial problem to solve as the same page can be
> > granted multiple times, therefore could have multiple pseudo-physical
> > addresses and multiple struct pages corresponding to one physical
> > address in dom0.
>
> So what is mfn_to_pfn() and xen_bus_to_phys() used for?

They are not used on ARM. But swiotlb-xen is not ARM specific and
xen_bus_to_phys is still important on x86. It works differently there.


> Isn't mfn_to_pfn(pfn_to_mfn(x)) an identity function?

Yes, it should be and it is for local pages.
However it is not an identity function for foreign pages, because
mfn_to_pfn is broken and just returns pfn on ARM, while pfn_to_mfn works
correctly.
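
In other words (just summarising the ARM behaviour, not actual code):

	unsigned long mfn = pfn_to_mfn(pfn);	/* real lookup, foreign-safe */
	unsigned long back = mfn_to_pfn(mfn);	/* on ARM: simply returns mfn */
	/* back == pfn only holds when pfn == mfn, i.e. for local 1:1 pages */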


> > Fortunately these dma_ops don't actually need to do much. In fact they
> > only need to flush caches if the device is not dma-coherent. So the
> > current proposed solution is to issue a hypercall to ask Xen to do the
> > flushing, passing the physical address dom0 knows.
>
> I really don't like the dma_cache_maint() duplication you added in
> arch/arm/xen/mm32.c just because you cannot call the host dma_ops when
> the page is not available.

I agree it is bad, but this patch series is a big step forward in removing the
duplication. In fact now that I think about it, when a page is local (not a
foreign page) I could just call the host dma_ops. mm.c:dma_cache_maint
would only be used for foreign pages (pages for which !pfn_valid(pfn)).


> You basically only need a VA that was mapped in dom0 when map_page() was
> called and is still mapped when page_unmap() is called. You can free the
> actual mapping after calling __generic_dma_ops()->unmap_page() in
> xen_dma_unmap_page() (in xen_unmap_single()).

That is true, but where would I store the mapping?

I don't want to introduce another data structure to keep physical to va
or physical to struct page mappings that I need to modify at every
map_page and unmap_page. It wouldn't even be a trivial data structure as
multiple dma operations involving the same mfn but different struct page
can happen simultaneously.

The performance hit of maintaining such a data structure would be
significant.

It would negatively impact any dma operations involving any devices,
including dma coherent ones. While this series only requires special
handling of dma operations involving non-coherent devices (resulting in
a hypercall being made).

In a way this series rewards hardware that makes life easier for software
developers :-)

2014-11-07 14:12:32

by Stefano Stabellini

[permalink] [raw]
Subject: Re: [PATCH v7 3/8] arm64: introduce is_device_dma_coherent

On Fri, 7 Nov 2014, Catalin Marinas wrote:
> On Fri, Nov 07, 2014 at 11:05:25AM +0000, Catalin Marinas wrote:
> > BTW, drivers/xen/swiotlb-xen.c needs something like:
> >
> > dma_addr_t dev_addr = xen_phys_to_bus(phys_to_dma(phys));
> >
> > phys != bus address.
>
> Wrong place, what I meant is when calling:
>
> xen_dma_unmap_page(hwdev, phys_to_dma(paddr), size, dir, attrs);

Well spotted! This is an actual mistake in the current code, independent
of the problem described before. Julien reported it a couple of days
ago and I have already queued a patch to fix it at the end of the
series.

2014-11-07 14:22:17

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH v7 6/8] xen/arm/arm64: merge xen/mm32.c into xen/mm.c

On Mon, Oct 27, 2014 at 03:09:26PM +0000, Stefano Stabellini wrote:
> Merge xen/mm32.c into xen/mm.c.
> As a consequence the code gets compiled on arm64 too: introduce a few
> compat functions to actually be able to compile it.
>
> Signed-off-by: Stefano Stabellini <[email protected]>
> Acked-by: Ian Campbell <[email protected]>

Since I missed the commit introducing mm32.c (340720be32d4 xen/arm:
reimplement xen_dma_unmap_page & friends), I'll add a retrospective NAK ;).

The main reason is the asymmetry between dma map and unmap. With host
swiotlb somehow getting !dma_capable(dev), you even risk leaking dom0
swiotlb bounce buffers (on arm64).

> --- a/arch/arm/xen/mm.c
> +++ b/arch/arm/xen/mm.c
[...]
> +/* functions called by SWIOTLB */
> +
> +static void dma_cache_maint(dma_addr_t handle, unsigned long offset,
> +			    size_t size, enum dma_data_direction dir,
> +			    void (*op)(const void *, size_t, int))
> +{
> +	unsigned long pfn;
> +	size_t left = size;
> +
> +	pfn = (handle >> PAGE_SHIFT) + offset / PAGE_SIZE;
> +	offset %= PAGE_SIZE;
> +
> +	do {
> +		size_t len = left;
> +		void *vaddr;
> +
> +		if (!pfn_valid(pfn))

Is this the pfn or the mfn? As you said in the previous email, there is
no mfn_to_pfn() conversion, so that's actually in another address space
where dom0 pfn_valid() would not make sense.

Do you use this as a check for foreign pages? If yes, is the mfn for
such pages guaranteed to be different from any valid dom0 pfn?

> +		{
> +			/* TODO: cache flush */

What does this TODO mean here? Which cases are not covered yet?

> +		} else {
> +			struct page *page = pfn_to_page(pfn);

If the pfn here is correct, can you not just have something like:

void xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
			size_t size, enum dma_data_direction dir,
			struct dma_attrs *attrs)
{
	unsigned long pfn = handle >> PAGE_SHIFT; /* could use some macros */

	if (!pfn_valid(pfn)) {
		/* FIXME */
		return;
	}
	__generic_dma_ops(hwdev)->unmap_page(hwdev, handle, size, dir, attrs);
}

--
Catalin

2014-11-07 15:36:16

by Stefano Stabellini

[permalink] [raw]
Subject: Re: [PATCH v7 6/8] xen/arm/arm64: merge xen/mm32.c into xen/mm.c

On Fri, 7 Nov 2014, Catalin Marinas wrote:
> On Mon, Oct 27, 2014 at 03:09:26PM +0000, Stefano Stabellini wrote:
> > Merge xen/mm32.c into xen/mm.c.
> > As a consequence the code gets compiled on arm64 too: introduce a few
> > compat functions to actually be able to compile it.
> >
> > Signed-off-by: Stefano Stabellini <[email protected]>
> > Acked-by: Ian Campbell <[email protected]>
>
> Since I missed the commit introducing mm32.c (340720be32d4 xen/arm:
> reimplement xen_dma_unmap_page & friends), I'll add a retrospective NAK ;).
>
> The main reason is the asymmetry between dma map and unmap. With host
> swiotlb somehow getting !dma_capable(dev), you even risk leaking dom0
> swiotlb bounce buffers (on arm64).
>
> > --- a/arch/arm/xen/mm.c
> > +++ b/arch/arm/xen/mm.c
> [...]
> > +/* functions called by SWIOTLB */
> > +
> > +static void dma_cache_maint(dma_addr_t handle, unsigned long offset,
> > + size_t size, enum dma_data_direction dir,
> > + void (*op)(const void *, size_t, int))
> > +{
> > + unsigned long pfn;
> > + size_t left = size;
> > +
> > + pfn = (handle >> PAGE_SHIFT) + offset / PAGE_SIZE;
> > + offset %= PAGE_SIZE;
> > +
> > + do {
> > + size_t len = left;
> > + void *vaddr;
> > +
> > + if (!pfn_valid(pfn))
>
> Is this the pfn or the mfn? As you said in the previous email, there is
> no mfn_to_pfn() conversion, so that's actually in another address space
> where dom0 pfn_valid() would not make sense.

That is actually the mfn. The check works because dom0 is mapped 1:1, so
if the mfn is a valid pfn, then it means that it is a local page.


> Do you use this as a check for foreign pages? If yes, is the mfn for
> such pages guaranteed to be different from any valid dom0 pfn?

Yes and yes


> > + {
> > + /* TODO: cache flush */
>
> What does this TODO mean here? Which cases are not covered yet?

We are going to fill in the cache flush implementation for foreign pages
later, in patch 8/8.


> > + } else {
> > + struct page *page = pfn_to_page(pfn);
>
> If the pfn here is correct, can you not just have something like:
>
> void xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
> 			size_t size, enum dma_data_direction dir,
> 			struct dma_attrs *attrs)
> {
> 	unsigned long pfn = handle >> PAGE_SHIFT; /* could use some macros */
>
> 	if (!pfn_valid(pfn)) {
> 		/* FIXME */
> 		return;
> 	}
> 	__generic_dma_ops(hwdev)->unmap_page(hwdev, handle, size, dir, attrs);
> }

Yes, this is possible. Then in patch 8/8 I could remove the FIXME and
add a call to a function that issues the new GNTTABOP_cache_flush
hypercall. It would also remove the asymmetry you mentioned before
because we could do the same for map_page.
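
Something along these lines, I mean (the helper name is provisional):

	if (!pfn_valid(pfn)) {
		/* foreign page: ask Xen to flush by machine address */
		__xen_dma_unmap_page(hwdev, handle, size, dir, attrs);
		return;
	}
	__generic_dma_ops(hwdev)->unmap_page(hwdev, handle, size, dir, attrs);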

2014-11-07 16:00:19

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH v7 3/8] arm64: introduce is_device_dma_coherent

(talking to Ian in person helped a bit ;))

On Fri, Nov 07, 2014 at 02:10:25PM +0000, Stefano Stabellini wrote:
> On Fri, 7 Nov 2014, Catalin Marinas wrote:
> > On Thu, Nov 06, 2014 at 12:22:13PM +0000, Stefano Stabellini wrote:
> > > On Thu, 6 Nov 2014, Catalin Marinas wrote:
> > > > On Wed, Nov 05, 2014 at 06:15:38PM +0000, Stefano Stabellini wrote:
> > > > > On Wed, 5 Nov 2014, Catalin Marinas wrote:
> > > > > > Note that if !is_device_dma_coherent(), it doesn't always mean that
> > > > > > standard cache maintenance would be enough (but that's a Xen problem,
> > > > > > not sure how to solve).
> > > > >
> > > > > It is a thorny issue indeed.
> > > > > Xen would need to know how to do non-standard cache maintenance
> > > > > operations.
> > > >
> > > > Is EL2 hyp or EL1 dom0 doing such maintenance? If the latter, you could
> > > > just use the currently registered dma ops.
> > >
> > > It is Xen, so the EL2 hypervisor.
> >
> > So what is arch/arm/xen/mm32.c cache maintenance for? Doesn't it run in
> > dom0?
>
> This patch series changes that code path specifically. As a matter of
> fact it removes mm32.c.

Well, it actually merges it into another file, dma_cache_maint() still
remains.

> Before this patch series, cache maintenance in dom0 was being done with
> XENFEAT_grant_map_identity (see explanation in previous email).
> After this patch series, a hypercall is made when foreign pages are
> involved, see patch 8/8, the if(!pfn_valid(pfn)) code path. Local pages
> are still flushed in dom0.

OK. One question I had though is whether pfn_valid(mfn) is always false for
foreign pages. IIUC, dom0 uses a 1:1 pfn:mfn mapping, so it would work.
But from what I gather, there can be other guests, not necessarily dom0,
handling the DMA in which case, if it doesn't use a 1:1 pfn:mfn mapping,
your pfn_valid(mfn) check would no longer work for foreign pages.

> > > Fortunately these dma_ops don't actually need to do much. In fact they
> > > only need to flush caches if the device is not dma-coherent. So the
> > > current proposed solution is to issue a hypercall to ask Xen to do the
> > > flushing, passing the physical address dom0 knows.
> >
> > I really don't like the dma_cache_maint() duplication you added in
> > arch/arm/xen/mm32.c just because you cannot call the host dma_ops when
> > the page is not available.
>
> I agree it is bad, but this patch series is a big step forward in removing the
> duplication. In fact now that I think about it, when a page is local (not a
> foreign page) I could just call the host dma_ops. mm.c:dma_cache_maint
> would only be used for foreign pages (pages for which !pfn_valid(pfn)).

Indeed. I already replied to patch 6/8. This way I think you can remove
dma_cache_maint() duplication entirely, just add one specific to foreign
pages.

What I would like to see is xen_dma_map_page() also using hyp calls for
cache maintenance when !pfn_valid(), for symmetry with unmap. You would
need another argument to xen_dma_map_page() though to pass the real
device address or mfn (and on the map side you could simply check for
page_to_pfn(page) != mfn). For such cases, Xen swiotlb already handles
bouncing so you don't need dom0 swiotlb involved as well.

> > You basically only need a VA that was mapped in dom0 when map_page() was
> > called and is still mapped when page_unmap() is called. You can free the
> > actual mapping after calling __generic_dma_ops()->unmap_page() in
> > xen_dma_unmap_page() (in xen_unmap_single()).
>
> That is true, but where would I store the mapping?

dev_archdata ;).

> I don't want to introduce another data structure to keep physical to va
> or physical to struct page mappings that I need to modify at every
> map_page and unmap_page. It wouldn't even be a trivial data structure as
> multiple dma operations involving the same mfn but different struct page
> can happen simultaneously.
>
> The performance hit of maintaining such a data structure would be
> significant.

There would indeed be a performance hit especially for sg ops.

> It would negatively impact any dma operations involving any devices,
> including dma coherent ones. While this series only requires special
> handling of dma operations involving non-coherent devices (resulting in
> a hypercall being made).
>
> In a way this series rewards hardware that makes life easier for software
> developers :-)

I only need the pfn_valid() clarification now together with the
map/unmap symmetry fix.

I think for the time being is_device_dma_coherent() can stay as an
arch-only function as it is only used by Xen on arm/arm64. If we later
need this on other architectures, we can make it generic.

--
Catalin

2014-11-07 16:03:17

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH v7 6/8] xen/arm/arm64: merge xen/mm32.c into xen/mm.c

On Fri, Nov 07, 2014 at 03:28:38PM +0000, Stefano Stabellini wrote:
> On Fri, 7 Nov 2014, Catalin Marinas wrote:
> > On Mon, Oct 27, 2014 at 03:09:26PM +0000, Stefano Stabellini wrote:
> > > Merge xen/mm32.c into xen/mm.c.
> > > As a consequence the code gets compiled on arm64 too: introduce a few
> > > compat functions to actually be able to compile it.
> > >
> > > Signed-off-by: Stefano Stabellini <[email protected]>
> > > Acked-by: Ian Campbell <[email protected]>
> >
> > Since I missed the commit introducing mm32.c (340720be32d4 xen/arm:
> > reimplement xen_dma_unmap_page & friends), I'll add a retrospective NAK ;).
> >
> > The main reason is the asymmetry between dma map and unmap. With host
> > swiotlb somehow getting !dma_capable(dev), you even risk leaking dom0
> > swiotlb bounce buffers (on arm64).
> >
> > > --- a/arch/arm/xen/mm.c
> > > +++ b/arch/arm/xen/mm.c
> > [...]
> > > +/* functions called by SWIOTLB */
> > > +
> > > +static void dma_cache_maint(dma_addr_t handle, unsigned long offset,
> > > + size_t size, enum dma_data_direction dir,
> > > + void (*op)(const void *, size_t, int))
> > > +{
> > > + unsigned long pfn;
> > > + size_t left = size;
> > > +
> > > + pfn = (handle >> PAGE_SHIFT) + offset / PAGE_SIZE;
> > > + offset %= PAGE_SIZE;
> > > +
> > > + do {
> > > + size_t len = left;
> > > + void *vaddr;
> > > +
> > > + if (!pfn_valid(pfn))
> >
> > Is this the pfn or the mfn? As you said in the previous email, there is
> > no mfn_to_pfn() conversion, so that's actually in another address space
> > where dom0 pfn_valid() would not make sense.
>
> That is actually the mfn. The check works because dom0 is mapped 1:1, so
> if the mfn is a valid pfn, then it means that it is a local page.

So the Xen DMA ops would never be called on anything other than dom0? If
that's correct, the pfn_valid() check would work. But add some big
comments as it's not clear at all to someone not familiar with Xen.

--
Catalin

2014-11-07 16:48:31

by Stefano Stabellini

[permalink] [raw]
Subject: Re: [PATCH v7 6/8] xen/arm/arm64: merge xen/mm32.c into xen/mm.c

On Fri, 7 Nov 2014, Catalin Marinas wrote:
> On Fri, Nov 07, 2014 at 03:28:38PM +0000, Stefano Stabellini wrote:
> > On Fri, 7 Nov 2014, Catalin Marinas wrote:
> > > On Mon, Oct 27, 2014 at 03:09:26PM +0000, Stefano Stabellini wrote:
> > > > Merge xen/mm32.c into xen/mm.c.
> > > > As a consequence the code gets compiled on arm64 too: introduce a few
> > > > compat functions to actually be able to compile it.
> > > >
> > > > Signed-off-by: Stefano Stabellini <[email protected]>
> > > > Acked-by: Ian Campbell <[email protected]>
> > >
> > > Since I missed the commit introducing mm32.c (340720be32d4 xen/arm:
> > > reimplement xen_dma_unmap_page & friends), I'll add a retrospective NAK ;).
> > >
> > > The main reason is the asymmetry between dma map and unmap. With host
> > > swiotlb somehow getting !dma_capable(dev), you even risk leaking dom0
> > > swiotlb bounce buffers (on arm64).
> > >
> > > > --- a/arch/arm/xen/mm.c
> > > > +++ b/arch/arm/xen/mm.c
> > > [...]
> > > > +/* functions called by SWIOTLB */
> > > > +
> > > > +static void dma_cache_maint(dma_addr_t handle, unsigned long offset,
> > > > + size_t size, enum dma_data_direction dir,
> > > > + void (*op)(const void *, size_t, int))
> > > > +{
> > > > + unsigned long pfn;
> > > > + size_t left = size;
> > > > +
> > > > + pfn = (handle >> PAGE_SHIFT) + offset / PAGE_SIZE;
> > > > + offset %= PAGE_SIZE;
> > > > +
> > > > + do {
> > > > + size_t len = left;
> > > > + void *vaddr;
> > > > +
> > > > + if (!pfn_valid(pfn))
> > >
> > > Is this the pfn or the mfn? As you said in the previous email, there is
> > > no mfn_to_pfn() conversion, so that's actually in another address space
> > > where dom0 pfn_valid() would not make sense.
> >
> > That is actually the mfn. The check works because dom0 is mapped 1:1, so
> > if the mfn is a valid pfn, then it means that it is a local page.
>
> So the Xen DMA ops would never be called on anything other than dom0?

That's right.


> If that's correct, the pfn_valid() check would work. But add some big
> comments as it's not clear at all to someone not familiar with Xen.

Yeah...
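
Maybe something like this above the check (wording to be refined):

	/*
	 * Dom0 is mapped 1:1, so we can use pfn_valid() on the machine
	 * frame number: a valid dom0 pfn can only be a local page, while
	 * foreign pages granted by other domains never alias a valid
	 * dom0 pfn. Note that these DMA ops are only ever used in dom0.
	 */
	if (!pfn_valid(pfn))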

2014-11-07 17:38:02

by Stefano Stabellini

[permalink] [raw]
Subject: Re: [PATCH v7 3/8] arm64: introduce is_device_dma_coherent

On Fri, 7 Nov 2014, Catalin Marinas wrote:
> (talking to Ian in person helped a bit ;))
>
> On Fri, Nov 07, 2014 at 02:10:25PM +0000, Stefano Stabellini wrote:
> > On Fri, 7 Nov 2014, Catalin Marinas wrote:
> > > On Thu, Nov 06, 2014 at 12:22:13PM +0000, Stefano Stabellini wrote:
> > > > On Thu, 6 Nov 2014, Catalin Marinas wrote:
> > > > > On Wed, Nov 05, 2014 at 06:15:38PM +0000, Stefano Stabellini wrote:
> > > > > > On Wed, 5 Nov 2014, Catalin Marinas wrote:
> > > > > > > Note that if !is_device_dma_coherent(), it doesn't always mean that
> > > > > > > standard cache maintenance would be enough (but that's a Xen problem,
> > > > > > > not sure how to solve).
> > > > > >
> > > > > > It is a thorny issue indeed.
> > > > > > Xen would need to know how to do non-standard cache maintenance
> > > > > > operations.
> > > > >
> > > > > Is EL2 hyp or EL1 dom0 doing such maintenance? If the latter, you could
> > > > > just use the currently registered dma ops.
> > > >
> > > > It is Xen, so the EL2 hypervisor.
> > >
> > > So what is arch/arm/xen/mm32.c cache maintenance for? Doesn't it run in
> > > dom0?
> >
> > This patch series changes that code path specifically. As a matter of
> > fact it removes mm32.c.
>
> Well, it actually merges it into another file, dma_cache_maint() still
> remains.
>
> > Before this patch series, cache maintenance in dom0 was being done with
> > XENFEAT_grant_map_identity (see explanation in previous email).
> > After this patch series, a hypercall is made when foreign pages are
> > involved, see patch 8/8, the if(!pfn_valid(pfn)) code path. Local pages
> > are still flushed in dom0.
>
> OK. One question I had though is whether pfn_valid(mfn) is always false for
> foreign pages. IIUC, dom0 uses a 1:1 pfn:mfn mapping, so it would work.

Right


> But from what I gather, there can be other guests, not necessarily dom0,
> handling the DMA in which case, if it doesn't use a 1:1 pfn:mfn mapping,
> your pfn_valid(mfn) check would no longer work for foreign pages.

We don't support assigning devices to guests other than dom0 without an
SMMU. With an SMMU it should all be transparent. Either way the
pfn_valid check should always work.



> > > > Fortunately these dma_ops don't actually need to do much. In fact they
> > > > only need to flush caches if the device is not dma-coherent. So the
> > > > current proposed solution is to issue a hypercall to ask Xen to do the
> > > > flushing, passing the physical address dom0 knows.
> > >
> > > I really don't like the dma_cache_maint() duplication you added in
> > > arch/arm/xen/mm32.c just because you cannot call the host dma_ops when
> > > the page is not available.
> >
> > I agree it is bad, but this patch series is a big step forward in removing the
> > duplication. In fact now that I think about it, when a page is local (not a
> > foreign page) I could just call the host dma_ops. mm.c:dma_cache_maint
> > would only be used for foreign pages (pages for which !pfn_valid(pfn)).
>
> Indeed. I already replied to patch 6/8. This way I think you can remove
> dma_cache_maint() duplication entirely, just add one specific to foreign
> pages.

Yes, that was a very good suggestion; the code is much cleaner now.
Thanks!


> What I would like to see is xen_dma_map_page() also using hyp calls for
> cache maintenance when !pfn_valid(), for symmetry with unmap. You would
> need another argument to xen_dma_map_page() though to pass the real
> device address or mfn (and on the map side you could simply check for
> page_to_pfn(page) != mfn). For such cases, Xen swiotlb already handles
> bouncing so you don't need dom0 swiotlb involved as well.

I can see that it would be nice to have map_page and unmap_page be
symmetrical. However, actually doing the map_page flush with a hypercall
would slow things down. Hypercalls are slower than function calls. It is
faster to do the cache flushing in dom0 if possible. In the map_page
case we have the struct page so we can easily do it by calling the
native dma_ops function.

Maybe I could just add a comment to explain the reason for the asymmetry?


> > > You basically only need a VA that was mapped in dom0 when map_page() was
> > > called and is still mapped when page_unmap() is called. You can free the
> > > actual mapping after calling __generic_dma_ops()->unmap_page() in
> > > xen_dma_unmap_page() (in xen_unmap_single()).
> >
> > That is true, but where would I store the mapping?
>
> dev_archdata ;).

Uhm... I haven't thought about using dev_archdata. It might simplify the
problem a bit. I'll have to think it over. It could be 3.19 or 3.20
material.


> > I don't want to introduce another data structure to keep physical to va
> > or physical to struct page mappings that I need to modify at every
> > map_page and unmap_page. It wouldn't even be a trivial data structure as
> > multiple dma operations involving the same mfn but different struct page
> > can happen simultaneously.
> >
> > The performance hit of maintaining such a data structure would be
> > significant.
>
> There would indeed be a performance hit especially for sg ops.
>
> > It would negatively impact any dma operations involving any devices,
> > including dma coherent ones. While this series only requires special
> > handling of dma operations involving non-coherent devices (resulting in
> > a hypercall being made).
> >
> > In a way this series rewards hardware that makes life easier for software
> > developers :-)
>
> I only need the pfn_valid() clarification now together with the
> map/unmap symmetry fix.
>
> I think for the time being is_device_dma_coherent() can stay as an
> arch-only function as it is only used by Xen on arm/arm64. If we later
> need this on other architectures, we can make it generic.

OK

2014-11-07 17:48:40

by Stefano Stabellini

[permalink] [raw]
Subject: Re: [PATCH v7 3/8] arm64: introduce is_device_dma_coherent

On Fri, 7 Nov 2014, Stefano Stabellini wrote:
> On Fri, 7 Nov 2014, Catalin Marinas wrote:
> > What I would like to see is xen_dma_map_page() also using hyp calls for
> > cache maintenance when !pfn_valid(), for symmetry with unmap. You would
> > need another argument to xen_dma_map_page() though to pass the real
> > device address or mfn (and on the map side you could simply check for
> > page_to_pfn(page) != mfn). For such cases, Xen swiotlb already handles
> > bouncing so you don't need dom0 swiotlb involved as well.
>
> I can see that it would be nice to have map_page and unmap_page be
> symmetrical. However, actually doing the map_page flush with a hypercall
> would slow things down. Hypercalls are slower than function calls. It is
> faster to do the cache flushing in dom0 if possible. In the map_page
> case we have the struct page so we can easily do it by calling the
> native dma_ops function.
>
> Maybe I could just add a comment to explain the reason for the asymmetry?

Ah, but the problem is that map_page could allocate a swiotlb buffer
(actually it does on arm64) that, without a corresponding unmap_page
call, would end up being leaked, right?

Oh well.. two hypercalls it is then :-/

2014-11-07 18:14:40

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH v7 3/8] arm64: introduce is_device_dma_coherent

On Fri, Nov 07, 2014 at 05:35:41PM +0000, Stefano Stabellini wrote:
> On Fri, 7 Nov 2014, Stefano Stabellini wrote:
> > On Fri, 7 Nov 2014, Catalin Marinas wrote:
> > > What I would like to see is xen_dma_map_page() also using hyp calls for
> > > cache maintenance when !pfn_valid(), for symmetry with unmap. You would
> > > need another argument to xen_dma_map_page() though to pass the real
> > > device address or mfn (and on the map side you could simply check for
> > > page_to_pfn(page) != mfn). For such cases, Xen swiotlb already handles
> > > bouncing so you don't need dom0 swiotlb involved as well.
> >
> > I can see that it would be nice to have map_page and unmap_page be
> > symmetrical. However, actually doing the map_page flush with a hypercall
> > would slow things down. Hypercalls are slower than function calls. It is
> > faster to do the cache flushing in dom0 if possible. In the map_page
> > case we have the struct page so we can easily do it by calling the
> > native dma_ops function.
> >
> > Maybe I could just add a comment to explain the reason for the asymmetry?
>
> Ah, but the problem is that map_page could allocate a swiotlb buffer
> (it actually does on arm64) that, without a corresponding unmap_page
> call, would end up being leaked, right?

Yes. You could hack dma_capable() to always return true for dom0
(because the pfn/dma address here doesn't have anything to do with the
real mfn) but that's more of a hack assuming a lot about the swiotlb
implementation.

> Oh well.. two hypercalls it is then :-/

That looks cleaner to me (though I haven't seen the code yet ;)).

Alternatively, you could keep a permanent page (per-cpu) in dom0 that
you ask the hypervisor to point at the mfn just for the unmap cache
maintenance (similarly to the kernel's kmap_atomic, used for cache
maintenance on highmem pages). This could be more expensive but, OTOH,
you only need the hyp call on the unmap path. The is_device_dma_coherent()
check would still remain, to avoid any unnecessary map/unmap calls, but
for non-coherent devices you would use whatever the SoC provides.
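
Roughly something like this (a sketch only: xen_remap_percpu_page() is
a hypothetical hypercall wrapper that would retarget the reserved
per-cpu page at the given mfn, and the arm64 __dma_unmap_area helper
stands in for whatever maintenance the unmap path needs):

        static DEFINE_PER_CPU(void *, xen_cmaint_page);

        static void xen_unmap_cache_maint(unsigned long mfn, size_t size,
                                          enum dma_data_direction dir)
        {
                void *vaddr = get_cpu_var(xen_cmaint_page);

                xen_remap_percpu_page(vaddr, mfn);  /* hypothetical hyp call */
                __dma_unmap_area(vaddr, size, dir); /* arm64 cache maintenance */
                put_cpu_var(xen_cmaint_page);       /* tear down lazily */
        }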

--
Catalin

2014-11-07 18:45:37

by Stefano Stabellini

[permalink] [raw]
Subject: Re: [PATCH v7 3/8] arm64: introduce is_device_dma_coherent

On Fri, 7 Nov 2014, Catalin Marinas wrote:
> On Fri, Nov 07, 2014 at 05:35:41PM +0000, Stefano Stabellini wrote:
> > On Fri, 7 Nov 2014, Stefano Stabellini wrote:
> > > On Fri, 7 Nov 2014, Catalin Marinas wrote:
> > > > What I would like to see is xen_dma_map_page() also using hyp calls for
> > > > cache maintenance when !pfn_valid(), for symmetry with unmap. You would
> > > > need another argument to xen_dma_map_page() though to pass the real
> > > > device address or mfn (and on the map side you could simply check for
> > > > page_to_pfn(page) != mfn). For such cases, Xen swiotlb already handles
> > > > bouncing so you don't need dom0 swiotlb involved as well.
> > >
> > > I can see that it would be nice to have map_page and unmap_page be
> > > symmetrical. However, actually doing the map_page flush with a hypercall
> > > would slow things down. Hypercalls are slower than function calls. It is
> > > faster to do the cache flushing in dom0 if possible. In the map_page
> > > case we have the struct page so we can easily do it by calling the
> > > native dma_ops function.
> > >
> > > Maybe I could just add a comment to explain the reason for the asymmetry?
> >
> > Ah, but the problem is that map_page could allocate a swiotlb buffer
> > (it actually does on arm64) that, without a corresponding unmap_page
> > call, would end up being leaked, right?
>
> Yes. You could hack dma_capable() to always return true for dom0
> (because the pfn/dma address here doesn't have anything to do with the
> real mfn) but that's more of a hack assuming a lot about the swiotlb
> implementation.

Another idea would be to avoid calling the native map_page for foreign
pages: in the Xen-specific implementation, instead of making the
hypercall, we could call __dma_map_area on arm64 and map_page on arm.

Something like this:


In arch/arm/include/asm/xen/page-coherent.h:

static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
        dma_addr_t dev_addr, unsigned long offset, size_t size,
        enum dma_data_direction dir, struct dma_attrs *attrs)
{
        if (pfn_valid(PFN_DOWN(dev_addr))) {
                /* local page: defer to the native dma_ops */
                if (__generic_dma_ops(hwdev)->map_page)
                        __generic_dma_ops(hwdev)->map_page(hwdev, page, offset, size, dir, attrs);
        } else
                /* foreign page: Xen-specific cache handling */
                __xen_dma_map_page(hwdev, page, dev_addr, offset, size, dir, attrs);
}



In arch/arm/xen/mm.c:

void __xen_dma_map_page(struct device *hwdev, struct page *page,
        dma_addr_t dev_addr, unsigned long offset, size_t size,
        enum dma_data_direction dir, struct dma_attrs *attrs)
{
        if (is_device_dma_coherent(hwdev))
                return; /* coherent device: no cache maintenance needed */
#ifdef CONFIG_ARM64
        /* __dma_map_area takes a virtual address, not a physical one */
        __dma_map_area(page_address(page) + offset, size, dir);
#else
        __generic_dma_ops(hwdev)->map_page(hwdev, page, offset, size, dir, attrs);
#endif
}


It wouldn't be as nice as using the hypercall but it would be faster and
wouldn't depend on the inner workings of the arm64 implementation of
map_page, except for __dma_map_area.

2014-11-10 10:17:04

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH v7 3/8] arm64: introduce is_device_dma_coherent

On Fri, Nov 07, 2014 at 06:45:22PM +0000, Stefano Stabellini wrote:
> On Fri, 7 Nov 2014, Catalin Marinas wrote:
> > On Fri, Nov 07, 2014 at 05:35:41PM +0000, Stefano Stabellini wrote:
> > > On Fri, 7 Nov 2014, Stefano Stabellini wrote:
> > > > On Fri, 7 Nov 2014, Catalin Marinas wrote:
> > > > > What I would like to see is xen_dma_map_page() also using hyp calls for
> > > > > cache maintenance when !pfn_valid(), for symmetry with unmap. You would
> > > > > need another argument to xen_dma_map_page() though to pass the real
> > > > > device address or mfn (and on the map side you could simply check for
> > > > > page_to_pfn(page) != mfn). For such cases, Xen swiotlb already handles
> > > > > bouncing so you don't need dom0 swiotlb involved as well.
> > > >
> > > > I can see that it would be nice to have map_page and unmap_page be
> > > > symmetrical. However, actually doing the map_page flush with a hypercall
> > > > would slow things down. Hypercalls are slower than function calls. It is
> > > > faster to do the cache flushing in dom0 if possible. In the map_page
> > > > case we have the struct page so we can easily do it by calling the
> > > > native dma_ops function.
> > > >
> > > > Maybe I could just add a comment to explain the reason for the asymmetry?
> > >
> > > Ah, but the problem is that map_page could allocate a swiotlb buffer
> > > (it actually does on arm64) that, without a corresponding unmap_page
> > > call, would end up being leaked, right?
> >
> > Yes. You could hack dma_capable() to always return true for dom0
> > (because the pfn/dma address here doesn't have anything to do with the
> > real mfn) but that's more of a hack assuming a lot about the swiotlb
> > implementation.
>
> Another idea would be to avoid calling the native map_page for foreign
> pages: in the Xen-specific implementation, instead of making the
> hypercall, we could call __dma_map_area on arm64 and map_page on arm.

The problem here is that you assume that for an SoC that is not fully
coherent all it needs is __dma_map_area. If you look at mach-mvebu, the
DMA is nearly cache coherent but it needs a specific synchronisation
barrier at the interconnect level. If we get something like this on a
platform with virtualisation, it would be implemented at the dom0 level
by SoC-specific DMA ops. The Xen hypervisor, I assume, has its own BSP,
so it could implement SoC-specific cache flushing there. But with the
mix of cache flushing in dom0 on map and in the hypervisor on unmap,
such a thing would no longer be possible in a nice way.

> In arch/arm/include/asm/xen/page-coherent.h:
>
> static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
>         dma_addr_t dev_addr, unsigned long offset, size_t size,
>         enum dma_data_direction dir, struct dma_attrs *attrs)
> {
>         if (pfn_valid(PFN_DOWN(dev_addr))) {

BTW, pfn_valid() is more expensive than simply comparing the mfn with
the pfn for dom0. The calling code knows this already, and it may be
quicker to simply pass a "bool foreign" argument.
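
Something along these lines (just a sketch):

        /* in dom0 a local page has page_to_pfn(page) == PFN_DOWN(dev_addr),
         * so a plain comparison replaces the pfn_valid() lookup */
        bool foreign = page_to_pfn(page) != PFN_DOWN(dev_addr);

        if (!foreign) {
                if (__generic_dma_ops(hwdev)->map_page)
                        __generic_dma_ops(hwdev)->map_page(hwdev, page,
                                        offset, size, dir, attrs);
        } else
                __xen_dma_map_page(hwdev, page, dev_addr, offset, size,
                                   dir, attrs);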

> It wouldn't be as nice as using the hypercall but it would be faster and
> wouldn't depend on the inner workings of the arm64 implementation of
> map_page, except for __dma_map_area.

This "except" is big as we may at some point get some SoC like mvebu (I
would say less likely for arm64 than arm32).

BTW, does the Xen hypervisor already have a mapping of the mfn? If not,
and it has to create one temporarily for the flush, you may not lose
much by using the dom0 ops for unmap with a hypercall for temporarily
mapping the foreign page in dom0's IPA space (you could do the unmapping
lazily).

--
Catalin

2014-11-10 12:35:51

by Stefano Stabellini

[permalink] [raw]
Subject: Re: [PATCH v7 3/8] arm64: introduce is_device_dma_coherent

On Mon, 10 Nov 2014, Catalin Marinas wrote:
> On Fri, Nov 07, 2014 at 06:45:22PM +0000, Stefano Stabellini wrote:
> > On Fri, 7 Nov 2014, Catalin Marinas wrote:
> > > On Fri, Nov 07, 2014 at 05:35:41PM +0000, Stefano Stabellini wrote:
> > > > On Fri, 7 Nov 2014, Stefano Stabellini wrote:
> > > > > On Fri, 7 Nov 2014, Catalin Marinas wrote:
> > > > > > What I would like to see is xen_dma_map_page() also using hyp calls for
> > > > > > cache maintenance when !pfn_valid(), for symmetry with unmap. You would
> > > > > > need another argument to xen_dma_map_page() though to pass the real
> > > > > > device address or mfn (and on the map side you could simply check for
> > > > > > page_to_pfn(page) != mfn). For such cases, Xen swiotlb already handles
> > > > > > bouncing so you don't need dom0 swiotlb involved as well.
> > > > >
> > > > > I can see that it would be nice to have map_page and unmap_page be
> > > > > symmetrical. However, actually doing the map_page flush with a hypercall
> > > > > would slow things down. Hypercalls are slower than function calls. It is
> > > > > faster to do the cache flushing in dom0 if possible. In the map_page
> > > > > case we have the struct page so we can easily do it by calling the
> > > > > native dma_ops function.
> > > > >
> > > > > Maybe I could just add a comment to explain the reason for the asymmetry?
> > > >
> > > > Ah, but the problem is that map_page could allocate a swiotlb buffer
> > > > (it actually does on arm64) that, without a corresponding unmap_page
> > > > call, would end up being leaked, right?
> > >
> > > Yes. You could hack dma_capable() to always return true for dom0
> > > (because the pfn/dma address here doesn't have anything to do with the
> > > real mfn) but that's more of a hack assuming a lot about the swiotlb
> > > implementation.
> >
> > Another idea would be to avoid calling the native map_page for foreign
> > pages: in the Xen-specific implementation, instead of making the
> > hypercall, we could call __dma_map_area on arm64 and map_page on arm.
>
> The problem here is that you assume that for an SoC that is not fully
> coherent all it needs is __dma_map_area. If you look at mach-mvebu, the
> DMA is nearly cache coherent but it needs a specific synchronisation
> barrier at the interconnect level. If we get something like this on a
> platform with virtualisation, it would be implemented at the dom0 level
> by SoC-specific DMA ops. The Xen hypervisor, I assume, has its own BSP,
> so it could implement SoC-specific cache flushing there. But with the
> mix of cache flushing in dom0 on map and in the hypervisor on unmap,
> such a thing would no longer be possible in a nice way.

Yes, Xen has its own little platform files.

I think you have convinced me that having map in Linux and unmap in Xen
is not a solid long-term solution.


> > In arch/arm/include/asm/xen/page-coherent.h:
> >
> > static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
> >         dma_addr_t dev_addr, unsigned long offset, size_t size,
> >         enum dma_data_direction dir, struct dma_attrs *attrs)
> > {
> >         if (pfn_valid(PFN_DOWN(dev_addr))) {
>
> BTW, pfn_valid() is more expensive than simply comparing the mfn with
> the pfn for dom0. The calling code knows this already, and it may be
> quicker to simply pass a "bool foreign" argument.

Good point, I can compare mfn and pfn here.
I cannot do the same in unmap_page and the various sync operations,
because there is no pfn available, but maybe I could use
set_page_private to set a flag. I'll think about it.
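
Something like this, maybe (purely a sketch; the flag value is made up):

        /* at map time, tag the dom0 page that backs a foreign grant;
         * XEN_FOREIGN_DMA is a made-up value for illustration */
        #define XEN_FOREIGN_DMA 1UL
        set_page_private(page, XEN_FOREIGN_DMA);

        /* later, wherever a struct page is available */
        if (page_private(page) == XEN_FOREIGN_DMA)
                /* foreign: Xen-specific cache maintenance */;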


> > It wouldn't be as nice as using the hypercall but it would be faster and
> > wouldn't depend on the inner workings of the arm64 implementation of
> > map_page, except for __dma_map_area.
>
> This "except" is big as we may at some point get some SoC like mvebu (I
> would say less likely for arm64 than arm32).
>
> BTW, does the Xen hypervisor already have a mapping of the mfn?

On arm64, it does.
On arm32, we have a pool of mappings; when we run out, we start
overwriting old mappings.


> If not,
> and it has to create one temporarily for the flush, you may not lose
> much by using the dom0 ops for unmap with a hypercall for temporarily
> mapping the foreign page in dom0's IPA space (you could do the unmapping
> lazily).

That's true. Fortunately on arm64 we are OK.

2014-11-10 12:42:13

by Stefano Stabellini

[permalink] [raw]
Subject: Re: [PATCH v7 3/8] arm64: introduce is_device_dma_coherent

On Mon, 10 Nov 2014, Stefano Stabellini wrote:
> > BTW, pfn_valid() is more expensive than simply comparing the mfn with
> > the pfn for dom0. The calling code knows this already, and it may be
> > quicker to simply pass a "bool foreign" argument.
>
> Good point, I can compare mfn and pfn here.
> I cannot do the same in unmap_page and the various sync operations,
> because there is no pfn available, but maybe I could use
> set_page_private to set a flag. I'll think about it.

But of course there is no struct page either, so I think pfn_valid()
will have to stay in unmap and sync.
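
So the unmap side would be something like this (a sketch mirroring the
map side above; with only the dma handle available, pfn_valid() is what
tells local pages from foreign grants):

        static inline void xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
                size_t size, enum dma_data_direction dir, struct dma_attrs *attrs)
        {
                if (pfn_valid(PFN_DOWN(handle))) {
                        /* local page: defer to the native dma_ops */
                        if (__generic_dma_ops(hwdev)->unmap_page)
                                __generic_dma_ops(hwdev)->unmap_page(hwdev, handle,
                                                size, dir, attrs);
                } else
                        /* foreign grant: Xen-specific path */
                        __xen_dma_unmap_page(hwdev, handle, size, dir, attrs);
        }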