2019-08-16 13:02:49

by Christoph Hellwig

[permalink] [raw]
Subject: swiotlb-xen cleanups

Hi Xen maintainers and friends,

please take a look at this series that cleans up the parts of swiotlb-xen
that deal with non-coherent caches.


2019-08-16 13:03:23

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 06/11] swiotlb-xen: always use dma-direct helpers to alloc coherent pages

x86 currently calls alloc_pages, but using dma-direct works as well
there, with the added benefit of using the CMA pool if available.
The biggest advantage is of course to remove a pointless bit of
architecture specific code.

Signed-off-by: Christoph Hellwig <[email protected]>
---
arch/x86/include/asm/xen/page-coherent.h | 16 ----------------
drivers/xen/swiotlb-xen.c | 7 +++----
include/xen/arm/page-coherent.h | 12 ------------
3 files changed, 3 insertions(+), 32 deletions(-)

diff --git a/arch/x86/include/asm/xen/page-coherent.h b/arch/x86/include/asm/xen/page-coherent.h
index 116777e7f387..8ee33c5edded 100644
--- a/arch/x86/include/asm/xen/page-coherent.h
+++ b/arch/x86/include/asm/xen/page-coherent.h
@@ -5,22 +5,6 @@
#include <asm/page.h>
#include <linux/dma-mapping.h>

-static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
- dma_addr_t *dma_handle, gfp_t flags,
- unsigned long attrs)
-{
- void *vstart = (void*)__get_free_pages(flags, get_order(size));
- *dma_handle = virt_to_phys(vstart);
- return vstart;
-}
-
-static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
- void *cpu_addr, dma_addr_t dma_handle,
- unsigned long attrs)
-{
- free_pages((unsigned long) cpu_addr, get_order(size));
-}
-
static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
dma_addr_t dev_addr, unsigned long offset, size_t size,
enum dma_data_direction dir, unsigned long attrs) { }
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index b8808677ae1d..f9dd4cb6e4b3 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -299,8 +299,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
* address. In fact on ARM virt_to_phys only works for kernel direct
* mapped RAM memory. Also see comment below.
*/
- ret = xen_alloc_coherent_pages(hwdev, size, dma_handle, flags, attrs);
-
+ ret = dma_direct_alloc(hwdev, size, dma_handle, flags, attrs);
if (!ret)
return ret;

@@ -319,7 +318,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
else {
if (xen_create_contiguous_region(phys, order,
fls64(dma_mask), dma_handle) != 0) {
- xen_free_coherent_pages(hwdev, size, ret, (dma_addr_t)phys, attrs);
+ dma_direct_free(hwdev, size, ret, (dma_addr_t)phys, attrs);
return NULL;
}
SetPageXenRemapped(virt_to_page(ret));
@@ -351,7 +350,7 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
TestClearPageXenRemapped(virt_to_page(vaddr)))
xen_destroy_contiguous_region(phys, order);

- xen_free_coherent_pages(hwdev, size, vaddr, (dma_addr_t)phys, attrs);
+ dma_direct_free(hwdev, size, vaddr, (dma_addr_t)phys, attrs);
}

/*
diff --git a/include/xen/arm/page-coherent.h b/include/xen/arm/page-coherent.h
index da2cc09c8eda..4294a31305ca 100644
--- a/include/xen/arm/page-coherent.h
+++ b/include/xen/arm/page-coherent.h
@@ -16,18 +16,6 @@ void __xen_dma_sync_single_for_cpu(struct device *hwdev,
void __xen_dma_sync_single_for_device(struct device *hwdev,
dma_addr_t handle, size_t size, enum dma_data_direction dir);

-static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
- dma_addr_t *dma_handle, gfp_t flags, unsigned long attrs)
-{
- return dma_direct_alloc(hwdev, size, dma_handle, flags, attrs);
-}
-
-static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
- void *cpu_addr, dma_addr_t dma_handle, unsigned long attrs)
-{
- dma_direct_free(hwdev, size, cpu_addr, dma_handle, attrs);
-}
-
static inline void xen_dma_sync_single_for_cpu(struct device *hwdev,
dma_addr_t handle, size_t size, enum dma_data_direction dir)
{
--
2.20.1

2019-08-16 13:03:23

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 07/11] swiotlb-xen: provide a single page-coherent.h header

Merge the various page-coherent.h files into a single one that either
provides prototypes or stubs depending on the need for cache
maintainance.

For extra benefits alo include <xen/page-coherent.h> in the file
actually implementing the interfaces provided.

Signed-off-by: Christoph Hellwig <[email protected]>
---
arch/arm/include/asm/xen/page-coherent.h | 2 --
arch/arm/xen/mm.c | 1 +
arch/arm64/include/asm/xen/page-coherent.h | 2 --
arch/x86/include/asm/xen/page-coherent.h | 22 ------------------
drivers/xen/swiotlb-xen.c | 4 +---
include/Kbuild | 2 +-
include/xen/{arm => }/page-coherent.h | 27 +++++++++++++++++++---
7 files changed, 27 insertions(+), 33 deletions(-)
delete mode 100644 arch/arm/include/asm/xen/page-coherent.h
delete mode 100644 arch/arm64/include/asm/xen/page-coherent.h
delete mode 100644 arch/x86/include/asm/xen/page-coherent.h
rename include/xen/{arm => }/page-coherent.h (76%)

diff --git a/arch/arm/include/asm/xen/page-coherent.h b/arch/arm/include/asm/xen/page-coherent.h
deleted file mode 100644
index 27e984977402..000000000000
--- a/arch/arm/include/asm/xen/page-coherent.h
+++ /dev/null
@@ -1,2 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <xen/arm/page-coherent.h>
diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index a59980f1aa54..85482cdda1e5 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -15,6 +15,7 @@
#include <xen/interface/grant_table.h>
#include <xen/interface/memory.h>
#include <xen/page.h>
+#include <xen/page-coherent.h>
#include <xen/swiotlb-xen.h>

#include <asm/cacheflush.h>
diff --git a/arch/arm64/include/asm/xen/page-coherent.h b/arch/arm64/include/asm/xen/page-coherent.h
deleted file mode 100644
index 27e984977402..000000000000
--- a/arch/arm64/include/asm/xen/page-coherent.h
+++ /dev/null
@@ -1,2 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <xen/arm/page-coherent.h>
diff --git a/arch/x86/include/asm/xen/page-coherent.h b/arch/x86/include/asm/xen/page-coherent.h
deleted file mode 100644
index 8ee33c5edded..000000000000
--- a/arch/x86/include/asm/xen/page-coherent.h
+++ /dev/null
@@ -1,22 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _ASM_X86_XEN_PAGE_COHERENT_H
-#define _ASM_X86_XEN_PAGE_COHERENT_H
-
-#include <asm/page.h>
-#include <linux/dma-mapping.h>
-
-static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
- dma_addr_t dev_addr, unsigned long offset, size_t size,
- enum dma_data_direction dir, unsigned long attrs) { }
-
-static inline void xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
- size_t size, enum dma_data_direction dir,
- unsigned long attrs) { }
-
-static inline void xen_dma_sync_single_for_cpu(struct device *hwdev,
- dma_addr_t handle, size_t size, enum dma_data_direction dir) { }
-
-static inline void xen_dma_sync_single_for_device(struct device *hwdev,
- dma_addr_t handle, size_t size, enum dma_data_direction dir) { }
-
-#endif /* _ASM_X86_XEN_PAGE_COHERENT_H */
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index f9dd4cb6e4b3..7b23929854e7 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -31,12 +31,10 @@
#include <linux/export.h>
#include <xen/swiotlb-xen.h>
#include <xen/page.h>
+#include <xen/page-coherent.h>
#include <xen/xen-ops.h>
#include <xen/hvc-console.h>

-#include <asm/dma-mapping.h>
-#include <asm/xen/page-coherent.h>
-
#include <trace/events/swiotlb.h>
/*
* Used to do a quick range check in swiotlb_tbl_unmap_single and
diff --git a/include/Kbuild b/include/Kbuild
index c38f0d46b267..e2ae52ef9e1e 100644
--- a/include/Kbuild
+++ b/include/Kbuild
@@ -1189,7 +1189,6 @@ header-test- += video/vga.h
header-test- += video/w100fb.h
header-test- += xen/acpi.h
header-test- += xen/arm/hypercall.h
-header-test- += xen/arm/page-coherent.h
header-test- += xen/arm/page.h
header-test- += xen/balloon.h
header-test- += xen/events.h
@@ -1231,6 +1230,7 @@ header-test- += xen/interface/xen.h
header-test- += xen/interface/xenpmu.h
header-test- += xen/mem-reservation.h
header-test- += xen/page.h
+header-test- += xen/page-coherent.h
header-test- += xen/platform_pci.h
header-test- += xen/swiotlb-xen.h
header-test- += xen/xen-front-pgdir-shbuf.h
diff --git a/include/xen/arm/page-coherent.h b/include/xen/page-coherent.h
similarity index 76%
rename from include/xen/arm/page-coherent.h
rename to include/xen/page-coherent.h
index 4294a31305ca..7c32944de051 100644
--- a/include/xen/arm/page-coherent.h
+++ b/include/xen/page-coherent.h
@@ -1,10 +1,12 @@
/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _XEN_ARM_PAGE_COHERENT_H
-#define _XEN_ARM_PAGE_COHERENT_H
+#ifndef _XEN_PAGE_COHERENT_H
+#define _XEN_PAGE_COHERENT_H

#include <linux/dma-mapping.h>
#include <asm/page.h>

+#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) || \
+ defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU)
void __xen_dma_map_page(struct device *hwdev, struct page *page,
dma_addr_t dev_addr, unsigned long offset, size_t size,
enum dma_data_direction dir, unsigned long attrs);
@@ -71,5 +73,24 @@ static inline void xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
else
__xen_dma_unmap_page(hwdev, handle, size, dir, attrs);
}
+#else
+static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
+ dma_addr_t dev_addr, unsigned long offset, size_t size,
+ enum dma_data_direction dir, unsigned long attrs)
+{
+}
+static inline void xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
+ size_t size, enum dma_data_direction dir, unsigned long attrs)
+{
+}
+static inline void xen_dma_sync_single_for_cpu(struct device *hwdev,
+ dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+}
+static inline void xen_dma_sync_single_for_device(struct device *hwdev,
+ dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+}
+#endif

-#endif /* _XEN_ARM_PAGE_COHERENT_H */
+#endif /* _XEN_PAGE_COHERENT_H */
--
2.20.1

2019-08-16 13:03:42

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 08/11] swiotlb-xen: use the same foreign page check everywhere

xen_dma_map_page uses a different and more complicated check for
foreign pages than the other three cache maintainance helpers.
Switch it to the simpler pfn_vali method a well.

Signed-off-by: Christoph Hellwig <[email protected]>
---
include/xen/page-coherent.h | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/include/xen/page-coherent.h b/include/xen/page-coherent.h
index 7c32944de051..0f4d468e7a89 100644
--- a/include/xen/page-coherent.h
+++ b/include/xen/page-coherent.h
@@ -43,14 +43,9 @@ static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
dma_addr_t dev_addr, unsigned long offset, size_t size,
enum dma_data_direction dir, unsigned long attrs)
{
- unsigned long page_pfn = page_to_xen_pfn(page);
- unsigned long dev_pfn = XEN_PFN_DOWN(dev_addr);
- unsigned long compound_pages =
- (1<<compound_order(page)) * XEN_PFN_PER_PAGE;
- bool local = (page_pfn <= dev_pfn) &&
- (dev_pfn - page_pfn < compound_pages);
+ unsigned long pfn = PFN_DOWN(dev_addr);

- if (local)
+ if (pfn_valid(pfn))
dma_direct_map_page(hwdev, page, offset, size, dir, attrs);
else
__xen_dma_map_page(hwdev, page, dev_addr, offset, size, dir, attrs);
--
2.20.1

2019-08-16 13:03:43

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 01/11] xen/arm: use dma-noncoherent.h calls for xen-swiotlb cache maintainance

Reuse the arm64 code that uses the dma-direct/swiotlb helpers for DMA
non-coherent devices.

Signed-off-by: Christoph Hellwig <[email protected]>
---
arch/arm/Kconfig | 4 +
arch/arm/include/asm/device.h | 3 -
arch/arm/include/asm/xen/page-coherent.h | 93 ----------------------
arch/arm/mm/Kconfig | 4 -
arch/arm/mm/dma-mapping.c | 8 +-
arch/arm64/include/asm/xen/page-coherent.h | 75 -----------------
drivers/xen/swiotlb-xen.c | 49 +-----------
include/xen/arm/page-coherent.h | 71 +++++++++++++++++
8 files changed, 78 insertions(+), 229 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 33b00579beff..24360211534a 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -7,6 +7,8 @@ config ARM
select ARCH_HAS_BINFMT_FLAT
select ARCH_HAS_DEBUG_VIRTUAL if MMU
select ARCH_HAS_DEVMEM_IS_ALLOWED
+ select ARCH_HAS_DMA_COHERENT_TO_PFN if SWIOTLB
+ select ARCH_HAS_DMA_MMAP_PGPROT if SWIOTLB
select ARCH_HAS_ELF_RANDOMIZE
select ARCH_HAS_FORTIFY_SOURCE
select ARCH_HAS_KEEPINITRD
@@ -18,6 +20,8 @@ config ARM
select ARCH_HAS_SET_MEMORY
select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL
select ARCH_HAS_STRICT_MODULE_RWX if MMU
+ select ARCH_HAS_SYNC_DMA_FOR_DEVICE if SWIOTLB
+ select ARCH_HAS_SYNC_DMA_FOR_CPU if SWIOTLB
select ARCH_HAS_TEARDOWN_DMA_OPS if MMU
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
select ARCH_HAVE_CUSTOM_GPIO_H
diff --git a/arch/arm/include/asm/device.h b/arch/arm/include/asm/device.h
index f6955b55c544..c675bc0d5aa8 100644
--- a/arch/arm/include/asm/device.h
+++ b/arch/arm/include/asm/device.h
@@ -14,9 +14,6 @@ struct dev_archdata {
#endif
#ifdef CONFIG_ARM_DMA_USE_IOMMU
struct dma_iommu_mapping *mapping;
-#endif
-#ifdef CONFIG_XEN
- const struct dma_map_ops *dev_dma_ops;
#endif
unsigned int dma_coherent:1;
unsigned int dma_ops_setup:1;
diff --git a/arch/arm/include/asm/xen/page-coherent.h b/arch/arm/include/asm/xen/page-coherent.h
index 2c403e7c782d..27e984977402 100644
--- a/arch/arm/include/asm/xen/page-coherent.h
+++ b/arch/arm/include/asm/xen/page-coherent.h
@@ -1,95 +1,2 @@
/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _ASM_ARM_XEN_PAGE_COHERENT_H
-#define _ASM_ARM_XEN_PAGE_COHERENT_H
-
-#include <linux/dma-mapping.h>
-#include <asm/page.h>
#include <xen/arm/page-coherent.h>
-
-static inline const struct dma_map_ops *xen_get_dma_ops(struct device *dev)
-{
- if (dev && dev->archdata.dev_dma_ops)
- return dev->archdata.dev_dma_ops;
- return get_arch_dma_ops(NULL);
-}
-
-static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
- dma_addr_t *dma_handle, gfp_t flags, unsigned long attrs)
-{
- return xen_get_dma_ops(hwdev)->alloc(hwdev, size, dma_handle, flags, attrs);
-}
-
-static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
- void *cpu_addr, dma_addr_t dma_handle, unsigned long attrs)
-{
- xen_get_dma_ops(hwdev)->free(hwdev, size, cpu_addr, dma_handle, attrs);
-}
-
-static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
- dma_addr_t dev_addr, unsigned long offset, size_t size,
- enum dma_data_direction dir, unsigned long attrs)
-{
- unsigned long page_pfn = page_to_xen_pfn(page);
- unsigned long dev_pfn = XEN_PFN_DOWN(dev_addr);
- unsigned long compound_pages =
- (1<<compound_order(page)) * XEN_PFN_PER_PAGE;
- bool local = (page_pfn <= dev_pfn) &&
- (dev_pfn - page_pfn < compound_pages);
-
- /*
- * Dom0 is mapped 1:1, while the Linux page can span across
- * multiple Xen pages, it's not possible for it to contain a
- * mix of local and foreign Xen pages. So if the first xen_pfn
- * == mfn the page is local otherwise it's a foreign page
- * grant-mapped in dom0. If the page is local we can safely
- * call the native dma_ops function, otherwise we call the xen
- * specific function.
- */
- if (local)
- xen_get_dma_ops(hwdev)->map_page(hwdev, page, offset, size, dir, attrs);
- else
- __xen_dma_map_page(hwdev, page, dev_addr, offset, size, dir, attrs);
-}
-
-static inline void xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
- size_t size, enum dma_data_direction dir, unsigned long attrs)
-{
- unsigned long pfn = PFN_DOWN(handle);
- /*
- * Dom0 is mapped 1:1, while the Linux page can be spanned accross
- * multiple Xen page, it's not possible to have a mix of local and
- * foreign Xen page. Dom0 is mapped 1:1, so calling pfn_valid on a
- * foreign mfn will always return false. If the page is local we can
- * safely call the native dma_ops function, otherwise we call the xen
- * specific function.
- */
- if (pfn_valid(pfn)) {
- if (xen_get_dma_ops(hwdev)->unmap_page)
- xen_get_dma_ops(hwdev)->unmap_page(hwdev, handle, size, dir, attrs);
- } else
- __xen_dma_unmap_page(hwdev, handle, size, dir, attrs);
-}
-
-static inline void xen_dma_sync_single_for_cpu(struct device *hwdev,
- dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
- unsigned long pfn = PFN_DOWN(handle);
- if (pfn_valid(pfn)) {
- if (xen_get_dma_ops(hwdev)->sync_single_for_cpu)
- xen_get_dma_ops(hwdev)->sync_single_for_cpu(hwdev, handle, size, dir);
- } else
- __xen_dma_sync_single_for_cpu(hwdev, handle, size, dir);
-}
-
-static inline void xen_dma_sync_single_for_device(struct device *hwdev,
- dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
- unsigned long pfn = PFN_DOWN(handle);
- if (pfn_valid(pfn)) {
- if (xen_get_dma_ops(hwdev)->sync_single_for_device)
- xen_get_dma_ops(hwdev)->sync_single_for_device(hwdev, handle, size, dir);
- } else
- __xen_dma_sync_single_for_device(hwdev, handle, size, dir);
-}
-
-#endif /* _ASM_ARM_XEN_PAGE_COHERENT_H */
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index c54cd7ed90ba..c1222c0e9fd3 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -664,10 +664,6 @@ config ARM_LPAE
!CPU_32v4 && !CPU_32v3
select PHYS_ADDR_T_64BIT
select SWIOTLB
- select ARCH_HAS_DMA_COHERENT_TO_PFN
- select ARCH_HAS_DMA_MMAP_PGPROT
- select ARCH_HAS_SYNC_DMA_FOR_DEVICE
- select ARCH_HAS_SYNC_DMA_FOR_CPU
help
Say Y if you have an ARMv7 processor supporting the LPAE page
table format and you would like to access memory beyond the
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index d42557ee69c2..738097396445 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -1132,10 +1132,6 @@ static const struct dma_map_ops *arm_get_dma_map_ops(bool coherent)
* 32-bit DMA.
* Use the generic dma-direct / swiotlb ops code in that case, as that
* handles bounce buffering for us.
- *
- * Note: this checks CONFIG_ARM_LPAE instead of CONFIG_SWIOTLB as the
- * latter is also selected by the Xen code, but that code for now relies
- * on non-NULL dev_dma_ops. To be cleaned up later.
*/
if (IS_ENABLED(CONFIG_ARM_LPAE))
return NULL;
@@ -2363,10 +2359,8 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
set_dma_ops(dev, dma_ops);

#ifdef CONFIG_XEN
- if (xen_initial_domain()) {
- dev->archdata.dev_dma_ops = dev->dma_ops;
+ if (xen_initial_domain())
dev->dma_ops = xen_dma_ops;
- }
#endif
dev->archdata.dma_ops_setup = true;
}
diff --git a/arch/arm64/include/asm/xen/page-coherent.h b/arch/arm64/include/asm/xen/page-coherent.h
index d88e56b90b93..27e984977402 100644
--- a/arch/arm64/include/asm/xen/page-coherent.h
+++ b/arch/arm64/include/asm/xen/page-coherent.h
@@ -1,77 +1,2 @@
/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _ASM_ARM64_XEN_PAGE_COHERENT_H
-#define _ASM_ARM64_XEN_PAGE_COHERENT_H
-
-#include <linux/dma-mapping.h>
-#include <asm/page.h>
#include <xen/arm/page-coherent.h>
-
-static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
- dma_addr_t *dma_handle, gfp_t flags, unsigned long attrs)
-{
- return dma_direct_alloc(hwdev, size, dma_handle, flags, attrs);
-}
-
-static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
- void *cpu_addr, dma_addr_t dma_handle, unsigned long attrs)
-{
- dma_direct_free(hwdev, size, cpu_addr, dma_handle, attrs);
-}
-
-static inline void xen_dma_sync_single_for_cpu(struct device *hwdev,
- dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
- unsigned long pfn = PFN_DOWN(handle);
-
- if (pfn_valid(pfn))
- dma_direct_sync_single_for_cpu(hwdev, handle, size, dir);
- else
- __xen_dma_sync_single_for_cpu(hwdev, handle, size, dir);
-}
-
-static inline void xen_dma_sync_single_for_device(struct device *hwdev,
- dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
- unsigned long pfn = PFN_DOWN(handle);
- if (pfn_valid(pfn))
- dma_direct_sync_single_for_device(hwdev, handle, size, dir);
- else
- __xen_dma_sync_single_for_device(hwdev, handle, size, dir);
-}
-
-static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
- dma_addr_t dev_addr, unsigned long offset, size_t size,
- enum dma_data_direction dir, unsigned long attrs)
-{
- unsigned long page_pfn = page_to_xen_pfn(page);
- unsigned long dev_pfn = XEN_PFN_DOWN(dev_addr);
- unsigned long compound_pages =
- (1<<compound_order(page)) * XEN_PFN_PER_PAGE;
- bool local = (page_pfn <= dev_pfn) &&
- (dev_pfn - page_pfn < compound_pages);
-
- if (local)
- dma_direct_map_page(hwdev, page, offset, size, dir, attrs);
- else
- __xen_dma_map_page(hwdev, page, dev_addr, offset, size, dir, attrs);
-}
-
-static inline void xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
- size_t size, enum dma_data_direction dir, unsigned long attrs)
-{
- unsigned long pfn = PFN_DOWN(handle);
- /*
- * Dom0 is mapped 1:1, while the Linux page can be spanned accross
- * multiple Xen page, it's not possible to have a mix of local and
- * foreign Xen page. Dom0 is mapped 1:1, so calling pfn_valid on a
- * foreign mfn will always return false. If the page is local we can
- * safely call the native dma_ops function, otherwise we call the xen
- * specific function.
- */
- if (pfn_valid(pfn))
- dma_direct_unmap_page(hwdev, handle, size, dir, attrs);
- else
- __xen_dma_unmap_page(hwdev, handle, size, dir, attrs);
-}
-
-#endif /* _ASM_ARM64_XEN_PAGE_COHERENT_H */
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index ae1df496bf38..b8808677ae1d 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -547,51 +547,6 @@ xen_swiotlb_dma_supported(struct device *hwdev, u64 mask)
return xen_virt_to_bus(xen_io_tlb_end - 1) <= mask;
}

-/*
- * Create userspace mapping for the DMA-coherent memory.
- * This function should be called with the pages from the current domain only,
- * passing pages mapped from other domains would lead to memory corruption.
- */
-static int
-xen_swiotlb_dma_mmap(struct device *dev, struct vm_area_struct *vma,
- void *cpu_addr, dma_addr_t dma_addr, size_t size,
- unsigned long attrs)
-{
-#ifdef CONFIG_ARM
- if (xen_get_dma_ops(dev)->mmap)
- return xen_get_dma_ops(dev)->mmap(dev, vma, cpu_addr,
- dma_addr, size, attrs);
-#endif
- return dma_common_mmap(dev, vma, cpu_addr, dma_addr, size, attrs);
-}
-
-/*
- * This function should be called with the pages from the current domain only,
- * passing pages mapped from other domains would lead to memory corruption.
- */
-static int
-xen_swiotlb_get_sgtable(struct device *dev, struct sg_table *sgt,
- void *cpu_addr, dma_addr_t handle, size_t size,
- unsigned long attrs)
-{
-#ifdef CONFIG_ARM
- if (xen_get_dma_ops(dev)->get_sgtable) {
-#if 0
- /*
- * This check verifies that the page belongs to the current domain and
- * is not one mapped from another domain.
- * This check is for debug only, and should not go to production build
- */
- unsigned long bfn = PHYS_PFN(dma_to_phys(dev, handle));
- BUG_ON (!page_is_ram(bfn));
-#endif
- return xen_get_dma_ops(dev)->get_sgtable(dev, sgt, cpu_addr,
- handle, size, attrs);
- }
-#endif
- return dma_common_get_sgtable(dev, sgt, cpu_addr, handle, size, attrs);
-}
-
const struct dma_map_ops xen_swiotlb_dma_ops = {
.alloc = xen_swiotlb_alloc_coherent,
.free = xen_swiotlb_free_coherent,
@@ -604,6 +559,6 @@ const struct dma_map_ops xen_swiotlb_dma_ops = {
.map_page = xen_swiotlb_map_page,
.unmap_page = xen_swiotlb_unmap_page,
.dma_supported = xen_swiotlb_dma_supported,
- .mmap = xen_swiotlb_dma_mmap,
- .get_sgtable = xen_swiotlb_get_sgtable,
+ .mmap = dma_common_mmap,
+ .get_sgtable = dma_common_get_sgtable,
};
diff --git a/include/xen/arm/page-coherent.h b/include/xen/arm/page-coherent.h
index 2ca9164a79bf..da2cc09c8eda 100644
--- a/include/xen/arm/page-coherent.h
+++ b/include/xen/arm/page-coherent.h
@@ -2,6 +2,9 @@
#ifndef _XEN_ARM_PAGE_COHERENT_H
#define _XEN_ARM_PAGE_COHERENT_H

+#include <linux/dma-mapping.h>
+#include <asm/page.h>
+
void __xen_dma_map_page(struct device *hwdev, struct page *page,
dma_addr_t dev_addr, unsigned long offset, size_t size,
enum dma_data_direction dir, unsigned long attrs);
@@ -13,4 +16,72 @@ void __xen_dma_sync_single_for_cpu(struct device *hwdev,
void __xen_dma_sync_single_for_device(struct device *hwdev,
dma_addr_t handle, size_t size, enum dma_data_direction dir);

+static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
+ dma_addr_t *dma_handle, gfp_t flags, unsigned long attrs)
+{
+ return dma_direct_alloc(hwdev, size, dma_handle, flags, attrs);
+}
+
+static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
+ void *cpu_addr, dma_addr_t dma_handle, unsigned long attrs)
+{
+ dma_direct_free(hwdev, size, cpu_addr, dma_handle, attrs);
+}
+
+static inline void xen_dma_sync_single_for_cpu(struct device *hwdev,
+ dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+ unsigned long pfn = PFN_DOWN(handle);
+
+ if (pfn_valid(pfn))
+ dma_direct_sync_single_for_cpu(hwdev, handle, size, dir);
+ else
+ __xen_dma_sync_single_for_cpu(hwdev, handle, size, dir);
+}
+
+static inline void xen_dma_sync_single_for_device(struct device *hwdev,
+ dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+ unsigned long pfn = PFN_DOWN(handle);
+ if (pfn_valid(pfn))
+ dma_direct_sync_single_for_device(hwdev, handle, size, dir);
+ else
+ __xen_dma_sync_single_for_device(hwdev, handle, size, dir);
+}
+
+static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
+ dma_addr_t dev_addr, unsigned long offset, size_t size,
+ enum dma_data_direction dir, unsigned long attrs)
+{
+ unsigned long page_pfn = page_to_xen_pfn(page);
+ unsigned long dev_pfn = XEN_PFN_DOWN(dev_addr);
+ unsigned long compound_pages =
+ (1<<compound_order(page)) * XEN_PFN_PER_PAGE;
+ bool local = (page_pfn <= dev_pfn) &&
+ (dev_pfn - page_pfn < compound_pages);
+
+ if (local)
+ dma_direct_map_page(hwdev, page, offset, size, dir, attrs);
+ else
+ __xen_dma_map_page(hwdev, page, dev_addr, offset, size, dir, attrs);
+}
+
+static inline void xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
+ size_t size, enum dma_data_direction dir, unsigned long attrs)
+{
+ unsigned long pfn = PFN_DOWN(handle);
+ /*
+ * Dom0 is mapped 1:1, while the Linux page can be spanned accross
+ * multiple Xen page, it's not possible to have a mix of local and
+ * foreign Xen page. Dom0 is mapped 1:1, so calling pfn_valid on a
+ * foreign mfn will always return false. If the page is local we can
+ * safely call the native dma_ops function, otherwise we call the xen
+ * specific function.
+ */
+ if (pfn_valid(pfn))
+ dma_direct_unmap_page(hwdev, handle, size, dir, attrs);
+ else
+ __xen_dma_unmap_page(hwdev, handle, size, dir, attrs);
+}
+
#endif /* _XEN_ARM_PAGE_COHERENT_H */
--
2.20.1

2019-08-16 13:03:48

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 03/11] xen/arm: pass one less argument to dma_cache_maint

Instead of taking apart the dma address in both callers do it inside
dma_cache_maint itself.

Signed-off-by: Christoph Hellwig <[email protected]>
---
arch/arm/xen/mm.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index 90574d89d0d4..d9da24fda2f7 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -43,13 +43,15 @@ static bool hypercall_cflush = false;

/* functions called by SWIOTLB */

-static void dma_cache_maint(dma_addr_t handle, unsigned long offset,
- size_t size, enum dma_data_direction dir, enum dma_cache_op op)
+static void dma_cache_maint(dma_addr_t handle, size_t size,
+ enum dma_data_direction dir, enum dma_cache_op op)
{
struct gnttab_cache_flush cflush;
unsigned long xen_pfn;
+ unsigned long offset = handle & ~PAGE_MASK;
size_t left = size;

+ offset &= PAGE_MASK;
xen_pfn = (handle >> XEN_PAGE_SHIFT) + offset / XEN_PAGE_SIZE;
offset %= XEN_PAGE_SIZE;

@@ -86,13 +88,13 @@ static void dma_cache_maint(dma_addr_t handle, unsigned long offset,
static void __xen_dma_page_dev_to_cpu(struct device *hwdev, dma_addr_t handle,
size_t size, enum dma_data_direction dir)
{
- dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, DMA_UNMAP);
+ dma_cache_maint(handle, size, dir, DMA_UNMAP);
}

static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t handle,
size_t size, enum dma_data_direction dir)
{
- dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, DMA_MAP);
+ dma_cache_maint(handle, size, dir, DMA_MAP);
}

void __xen_dma_map_page(struct device *hwdev, struct page *page,
--
2.20.1

2019-08-16 13:38:55

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH 03/11] xen/arm: pass one less argument to dma_cache_maint

On 16/08/2019 14:00, Christoph Hellwig wrote:
> Instead of taking apart the dma address in both callers do it inside
> dma_cache_maint itself.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> ---
> arch/arm/xen/mm.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
> index 90574d89d0d4..d9da24fda2f7 100644
> --- a/arch/arm/xen/mm.c
> +++ b/arch/arm/xen/mm.c
> @@ -43,13 +43,15 @@ static bool hypercall_cflush = false;
>
> /* functions called by SWIOTLB */
>
> -static void dma_cache_maint(dma_addr_t handle, unsigned long offset,
> - size_t size, enum dma_data_direction dir, enum dma_cache_op op)
> +static void dma_cache_maint(dma_addr_t handle, size_t size,
> + enum dma_data_direction dir, enum dma_cache_op op)
> {
> struct gnttab_cache_flush cflush;
> unsigned long xen_pfn;
> + unsigned long offset = handle & ~PAGE_MASK;
> size_t left = size;
>
> + offset &= PAGE_MASK;

Ahem... presumably that should be handle, not offset.

Robin.

> xen_pfn = (handle >> XEN_PAGE_SHIFT) + offset / XEN_PAGE_SIZE;
> offset %= XEN_PAGE_SIZE;
>
> @@ -86,13 +88,13 @@ static void dma_cache_maint(dma_addr_t handle, unsigned long offset,
> static void __xen_dma_page_dev_to_cpu(struct device *hwdev, dma_addr_t handle,
> size_t size, enum dma_data_direction dir)
> {
> - dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, DMA_UNMAP);
> + dma_cache_maint(handle, size, dir, DMA_UNMAP);
> }
>
> static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t handle,
> size_t size, enum dma_data_direction dir)
> {
> - dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, DMA_MAP);
> + dma_cache_maint(handle, size, dir, DMA_MAP);
> }
>
> void __xen_dma_map_page(struct device *hwdev, struct page *page,
>

2019-08-16 16:46:00

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 03/11] xen/arm: pass one less argument to dma_cache_maint

On Fri, Aug 16, 2019 at 02:37:58PM +0100, Robin Murphy wrote:
> On 16/08/2019 14:00, Christoph Hellwig wrote:
>> Instead of taking apart the dma address in both callers do it inside
>> dma_cache_maint itself.
>>
>> Signed-off-by: Christoph Hellwig <[email protected]>
>> ---
>> arch/arm/xen/mm.c | 10 ++++++----
>> 1 file changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
>> index 90574d89d0d4..d9da24fda2f7 100644
>> --- a/arch/arm/xen/mm.c
>> +++ b/arch/arm/xen/mm.c
>> @@ -43,13 +43,15 @@ static bool hypercall_cflush = false;
>> /* functions called by SWIOTLB */
>> -static void dma_cache_maint(dma_addr_t handle, unsigned long offset,
>> - size_t size, enum dma_data_direction dir, enum dma_cache_op op)
>> +static void dma_cache_maint(dma_addr_t handle, size_t size,
>> + enum dma_data_direction dir, enum dma_cache_op op)
>> {
>> struct gnttab_cache_flush cflush;
>> unsigned long xen_pfn;
>> + unsigned long offset = handle & ~PAGE_MASK;
>> size_t left = size;
>> + offset &= PAGE_MASK;
>
> Ahem... presumably that should be handle, not offset.

Ooops, yes.

2019-08-16 22:43:13

by Julien Grall

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH 07/11] swiotlb-xen: provide a single page-coherent.h header

Hi,

On 8/16/19 2:00 PM, Christoph Hellwig wrote:
> Merge the various page-coherent.h files into a single one that either
> provides prototypes or stubs depending on the need for cache
> maintainance.
>
> For extra benefits alo include <xen/page-coherent.h> in the file
> actually implementing the interfaces provided.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> ---
> arch/arm/include/asm/xen/page-coherent.h | 2 --
> arch/arm/xen/mm.c | 1 +
> arch/arm64/include/asm/xen/page-coherent.h | 2 --
> arch/x86/include/asm/xen/page-coherent.h | 22 ------------------
> drivers/xen/swiotlb-xen.c | 4 +---
> include/Kbuild | 2 +-
> include/xen/{arm => }/page-coherent.h | 27 +++++++++++++++++++---

I am not sure I agree with this rename. The implementation of the
helpers are very Arm specific as this is assuming Dom0 is 1:1 mapped.

This was necessary due to the lack of IOMMU on Arm platforms back then.
But this is now a pain to get rid of it on newer platform...

Cheers,

--
Julien Grall

2019-08-17 06:51:29

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH 07/11] swiotlb-xen: provide a single page-coherent.h header

On Fri, Aug 16, 2019 at 11:40:43PM +0100, Julien Grall wrote:
> I am not sure I agree with this rename. The implementation of the helpers
> are very Arm specific as this is assuming Dom0 is 1:1 mapped.
>
> This was necessary due to the lack of IOMMU on Arm platforms back then.
> But this is now a pain to get rid of it on newer platform...

So if you look at the final version of the header after the whole
series, what assumes a 1:1 mapping? It all just is

if (pfn_valid())
local cache sync;
else
call into the arch code;

are you concerned that the local cache sync might have to be split
up more for a non-1:1 map in that case? We could just move
the xen_dma_* routines into the arch instead of __xen_dma, but it
really helps to have a common interface header.

2019-08-17 18:26:57

by Julien Grall

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH 07/11] swiotlb-xen: provide a single page-coherent.h header

Hi Christoph,

On 8/17/19 7:50 AM, Christoph Hellwig wrote:
> On Fri, Aug 16, 2019 at 11:40:43PM +0100, Julien Grall wrote:
>> I am not sure I agree with this rename. The implementation of the helpers
>> are very Arm specific as this is assuming Dom0 is 1:1 mapped.
>>
>> This was necessary due to the lack of IOMMU on Arm platforms back then.
>> But this is now a pain to get rid of it on newer platform...
>
> So if you look at the final version of the header after the whole
> series, what assumes a 1:1 mapping? It all just is
>
> if (pfn_valid())
> local cache sync;
> else
> call into the arch code;

In the context of Xen Arm, the dev_addr is a host physical address. From
my understanding pfn_valid() is dealing with a guest physical frame.

Therefore by passing PFN_DOWN(dev_addr) in argument you assume that the
host and guest address spaces are the same.

>
> are you concerned that the local cache sync might have to be split
> up more for a non-1:1 map in that case? We could just movea
> the xen_dma_* routines into the arch instead of __xen_dma, but it
> really helps to have a common interface header.
Moving xen_dma_* routines into the arch would be a good option.
Although, I would still consider a stub version for arch not requiring
specific DMA.

Cheers,

--
Julien Grall

2019-08-19 11:46:34

by Julien Grall

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH 01/11] xen/arm: use dma-noncoherent.h calls for xen-swiotlb cache maintainance

Hi Christoph,

On 8/16/19 2:00 PM, Christoph Hellwig wrote:
> +static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
> + dma_addr_t dev_addr, unsigned long offset, size_t size,
> + enum dma_data_direction dir, unsigned long attrs)
> +{
> + unsigned long page_pfn = page_to_xen_pfn(page);
> + unsigned long dev_pfn = XEN_PFN_DOWN(dev_addr);
> + unsigned long compound_pages =
> + (1<<compound_order(page)) * XEN_PFN_PER_PAGE;
> + bool local = (page_pfn <= dev_pfn) &&
> + (dev_pfn - page_pfn < compound_pages);
> +

The Arm version as a comment here. Could we retain it?

> + if (local)
> + dma_direct_map_page(hwdev, page, offset, size, dir, attrs);
> + else
> + __xen_dma_map_page(hwdev, page, dev_addr, offset, size, dir, attrs);
> +}
> +

Cheers,

--
Julien Grall

2019-08-19 13:55:22

by Julien Grall

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH 08/11] swiotlb-xen: use the same foreign page check everywhere

Hi Christoph,

On 8/16/19 2:00 PM, Christoph Hellwig wrote:
> xen_dma_map_page uses a different and more complicated check for
> foreign pages than the other three cache maintainance helpers.
> Switch it to the simpler pfn_vali method a well.

NIT: s/pfn_vali/pfn_valid/

>
> Signed-off-by: Christoph Hellwig <[email protected]>

Reviewed-by: Julien Grall <[email protected]>

Cheers,

--
Julien Grall

2019-08-26 10:16:07

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH 01/11] xen/arm: use dma-noncoherent.h calls for xen-swiotlb cache maintainance

On Mon, Aug 19, 2019 at 12:45:17PM +0100, Julien Grall wrote:
> On 8/16/19 2:00 PM, Christoph Hellwig wrote:
>> +static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
>> + dma_addr_t dev_addr, unsigned long offset, size_t size,
>> + enum dma_data_direction dir, unsigned long attrs)
>> +{
>> + unsigned long page_pfn = page_to_xen_pfn(page);
>> + unsigned long dev_pfn = XEN_PFN_DOWN(dev_addr);
>> + unsigned long compound_pages =
>> + (1<<compound_order(page)) * XEN_PFN_PER_PAGE;
>> + bool local = (page_pfn <= dev_pfn) &&
>> + (dev_pfn - page_pfn < compound_pages);
>> +
>
> The Arm version as a comment here. Could we retain it?

I've added it in this patch, altough the rewrites later on mean it will
go away in favour of a new comment elsewhere anyway.

2019-08-27 02:02:08

by Stefano Stabellini

[permalink] [raw]
Subject: Re: swiotlb-xen cleanups

On Fri, 16 Aug 2019, Christoph Hellwig wrote:
> Hi Xen maintainers and friends,
>
> please take a look at this series that cleans up the parts of swiotlb-xen
> that deal with non-coherent caches.

Hi Christoph,

I just wanted to let you know that your series is on my radar, but I
have been swamped the last few days. I hope to get to it by the end of
the week.

Cheers,

Stefano

2019-08-27 06:23:56

by Christoph Hellwig

[permalink] [raw]
Subject: Re: swiotlb-xen cleanups

On Mon, Aug 26, 2019 at 07:00:44PM -0700, Stefano Stabellini wrote:
> On Fri, 16 Aug 2019, Christoph Hellwig wrote:
> > Hi Xen maintainers and friends,
> >
> > please take a look at this series that cleans up the parts of swiotlb-xen
> > that deal with non-coherent caches.
>
> Hi Christoph,
>
> I just wanted to let you know that your series is on my radar, but I
> have been swamped the last few days. I hope to get to it by the end of
> the week.

Thanks, and no rush. Note that I posted a v2 with a few significant
changes yesterday.