Hi all,
This patch series enables xen-swiotlb on arm and arm64.
Considering that all guests, including dom0, run on Xen on ARM with
second-stage translation enabled, it follows that without an IOMMU no
guest could actually drive the hardware.
The solution for platforms without an IOMMU is to use swiotlb-xen,
adapted to autotranslate guests. swiotlb-xen provides a set of dma_ops
that Linux can use to set up a buffer that is contiguous in stage-2
(machine) addresses and use it for DMA operations.
Basically Linux asks Xen to make a buffer contiguous and gets the
machine address for it. This buffer is going to be used by lib/swiotlb.c
to allocate bounce buffers.
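To make the flow concrete, here is a minimal sketch of the idea,
modelled on xen_swiotlb_fixup from this series (example_fixup itself is
only an illustration: error handling, the retry on address_bits and the
per-segment bookkeeping are simplified away):

/* Illustration only: make each IO_TLB_SEGSIZE chunk of the swiotlb
 * buffer machine-contiguous, then hand the whole buffer to
 * lib/swiotlb.c, which carves its bounce buffers out of it. */
static int example_fixup(void *buf, unsigned long nslabs)
{
	unsigned long i = 0;
	dma_addr_t dma_handle;
	int rc;

	do {
		unsigned long slabs = min(nslabs - i,
					  (unsigned long)IO_TLB_SEGSIZE);

		/* ask Xen for a machine-contiguous replacement and
		 * get back its machine (dma) address */
		rc = xen_create_contiguous_region(
			(unsigned long)buf + (i << IO_TLB_SHIFT),
			get_order(slabs << IO_TLB_SHIFT),
			32 /* address bits, simplified */, &dma_handle);
		if (rc)
			return rc;
		i += slabs;
	} while (i < nslabs);

	return swiotlb_init_with_tbl(buf, nslabs, 1 /* verbose */);
}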
Feedback is very welcome.
Cheers,
Stefano
Changes in v5:
- dropped the first two patches, already in the Xen tree;
- implement dma_mark_clean using dmac_flush_range on arm;
- add "arm64: define DMA_ERROR_CODE"
- better comment for XENMEM_exchange_and_pin return codes;
- fix xen_dma_add_entry error path;
- remove the spin_lock: the red-black tree is not modified at run time;
- add "swiotlb-xen: introduce xen_swiotlb_set_dma_mask";
- add "xen: introduce xen_alloc/free_coherent_pages";
- add "swiotlb-xen: use xen_alloc/free_coherent_pages";
- add "swiotlb: don't assume that io_tlb_start-io_tlb_end is coherent".
Changes in v4:
- rename XENMEM_get_dma_buf to XENMEM_exchange_and_pin;
- rename XENMEM_put_dma_buf to XENMEM_unpin;
- improve the documentation of the new hypercalls;
- add a note about out.address_bits for XENMEM_exchange;
- code style fixes;
- add err_out label in xen_dma_add_entry;
- remove INVALID_ADDRESS, use DMA_ERROR_CODE instead;
- add in-code comments regarding the usage of xen_dma_seg[0].dma_addr.
Changes in v3:
- add a patch to compile SWIOTLB without CONFIG_NEED_SG_DMA_LENGTH;
- add a patch to compile SWIOTLB_XEN without CONFIG_NEED_SG_DMA_LENGTH;
- arm/dma_capable: do not treat dma_mask as a limit;
- arm/dmabounce: keep using arm_dma_ops;
- add missing __init in xen_early_init declaration;
- many code style and name changes in swiotlb-xen.c;
- improve error checks in xen_dma_add_entry;
- warn on XENMEM_put_dma_buf failures.
Changes in v2:
- fixed a couple of errors in xen_bus_to_phys, xen_phys_to_bus and
xen_swiotlb_fixup.
Stefano Stabellini (13):
arm: make SWIOTLB available
arm: introduce a global dma_ops pointer
arm64: define DMA_ERROR_CODE
arm64: do not initialize arm64_swiotlb if dma_ops is already set
xen/arm,arm64: move Xen initialization earlier
xen: introduce XENMEM_exchange_and_pin and XENMEM_unpin
xen: make xen_create_contiguous_region return the dma address
swiotlb-xen: support autotranslate guests
xen/arm,arm64: enable SWIOTLB_XEN
swiotlb-xen: introduce xen_swiotlb_set_dma_mask
xen: introduce xen_alloc/free_coherent_pages
swiotlb-xen: use xen_alloc/free_coherent_pages
swiotlb: don't assume that io_tlb_start-io_tlb_end is coherent
arch/arm/Kconfig | 7 +
arch/arm/include/asm/dma-mapping.h | 35 +++++-
arch/arm/include/asm/xen/hypervisor.h | 8 +
arch/arm/include/asm/xen/page-coherent.h | 22 +++
arch/arm/include/asm/xen/page.h | 2 +
arch/arm/kernel/setup.c | 2 +
arch/arm/mm/dma-mapping.c | 3 +
arch/arm/xen/Makefile | 2 +-
arch/arm/xen/enlighten.c | 24 +++-
arch/arm/xen/mm.c | 117 ++++++++++++++++
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/dma-mapping.h | 1 +
arch/arm64/include/asm/xen/page-coherent.h | 24 +++
arch/arm64/kernel/setup.c | 2 +
arch/arm64/mm/dma-mapping.c | 2 +
arch/arm64/xen/Makefile | 2 +-
arch/ia64/include/asm/xen/page-coherent.h | 24 +++
arch/x86/include/asm/xen/page-coherent.h | 24 +++
arch/x86/xen/mmu.c | 11 +-
drivers/xen/Kconfig | 1 -
drivers/xen/swiotlb-xen.c | 209 ++++++++++++++++++++++++----
include/xen/interface/memory.h | 37 +++++
include/xen/swiotlb-xen.h | 2 +
include/xen/xen-ops.h | 7 +-
lib/swiotlb.c | 12 ++-
25 files changed, 535 insertions(+), 46 deletions(-)
git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen.git swiotlb-xen-5
Signed-off-by: Stefano Stabellini <[email protected]>
CC: [email protected]
CC: [email protected]
---
arch/arm64/include/asm/dma-mapping.h | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h
index 8d18100..c2cb8a0 100644
--- a/arch/arm64/include/asm/dma-mapping.h
+++ b/arch/arm64/include/asm/dma-mapping.h
@@ -25,6 +25,7 @@
#define ARCH_HAS_DMA_GET_REQUIRED_MASK
+#define DMA_ERROR_CODE (~0)
extern struct dma_map_ops *dma_ops;
static inline struct dma_map_ops *get_dma_ops(struct device *dev)
--
1.7.2.5
Support autotranslate guests in swiotlb-xen by keeping track of the
phys-to-bus and bus-to-phys mappings of the swiotlb buffer
(xen_io_tlb_start-xen_io_tlb_end).
Use a simple direct access on a pre-allocated array for phys-to-bus
queries. Use a red-black tree for bus-to-phys queries.
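In other words, the two lookups boil down to the following condensed
sketch of the code added by this patch (the real functions below also
handle the non-autotranslate case and out-of-range addresses):

/* phys-to-bus: direct index into xen_dma_seg[], one entry per
 * (IO_TLB_SEGSIZE << IO_TLB_SHIFT) bytes of the swiotlb buffer */
static dma_addr_t example_phys_to_bus(phys_addr_t paddr)
{
	int nr_seg = (paddr - xen_dma_seg[0].phys_addr) /
		     (IO_TLB_SEGSIZE << IO_TLB_SHIFT);

	return xen_dma_seg[nr_seg].dma_addr +
		(paddr - xen_dma_seg[nr_seg].phys_addr);
}

/* bus-to-phys: walk the red-black tree keyed by dma_addr and pick
 * the entry whose [dma_addr, dma_addr + size) range contains baddr */
static phys_addr_t example_bus_to_phys(dma_addr_t baddr)
{
	struct xen_dma_info *dma = xen_get_dma_info(baddr);

	return dma->phys_addr + (baddr - dma->dma_addr);
}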
Signed-off-by: Stefano Stabellini <[email protected]>
Reviewed-by: David Vrabel <[email protected]>
Changes in v5:
- fix xen_dma_add_entry error path;
- remove the spin_lock: the red-black tree is not modified at run time.
Changes in v4:
- add err_out label in xen_dma_add_entry;
- remove INVALID_ADDRESS, use DMA_ERROR_CODE instead;
- code style fixes;
- add in-code comments regarding the usage of xen_dma_seg[0].dma_addr.
Changes in v3:
- many code style and name changes;
- improve error checks in xen_dma_add_entry.
---
drivers/xen/swiotlb-xen.c | 155 ++++++++++++++++++++++++++++++++++++++++-----
1 files changed, 139 insertions(+), 16 deletions(-)
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index b72f31c..7bb99ae 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -38,32 +38,131 @@
#include <linux/bootmem.h>
#include <linux/dma-mapping.h>
#include <linux/export.h>
+#include <linux/slab.h>
+#include <linux/rbtree.h>
#include <xen/swiotlb-xen.h>
#include <xen/page.h>
#include <xen/xen-ops.h>
#include <xen/hvc-console.h>
+#include <xen/features.h>
/*
* Used to do a quick range check in swiotlb_tbl_unmap_single and
* swiotlb_tbl_sync_single_*, to see if the memory was in fact allocated by this
* API.
*/
+#define NR_DMA_SEGS ((xen_io_tlb_nslabs + IO_TLB_SEGSIZE - 1) / IO_TLB_SEGSIZE)
static char *xen_io_tlb_start, *xen_io_tlb_end;
static unsigned long xen_io_tlb_nslabs;
/*
* Quick lookup value of the bus address of the IOTLB.
*/
-static u64 start_dma_addr;
+struct xen_dma_info {
+ dma_addr_t dma_addr;
+ phys_addr_t phys_addr;
+ size_t size;
+ struct rb_node rbnode;
+};
+
+/*
+ * This array of struct xen_dma_info is indexed by physical addresses,
+ * starting from virt_to_phys(xen_io_tlb_start). Each entry maps
+ * (IO_TLB_SEGSIZE << IO_TLB_SHIFT) bytes, except the last one that is
+ * smaller. Getting the dma address corresponding to a given physical
+ * address can be done by direct access with the right index on the
+ * array.
+ */
+static struct xen_dma_info *xen_dma_seg;
+/*
+ * This tree keeps track of bus address to physical address
+ * mappings.
+ */
+static struct rb_root bus_to_phys = RB_ROOT;
+
+static int xen_dma_add_entry(struct xen_dma_info *new)
+{
+ struct rb_node **link = &bus_to_phys.rb_node;
+ struct rb_node *parent = NULL;
+ struct xen_dma_info *entry;
+ int rc = 0;
+
+ while (*link) {
+ parent = *link;
+ entry = rb_entry(parent, struct xen_dma_info, rbnode);
+
+ if (new->dma_addr == entry->dma_addr)
+ goto err_out;
+ if (new->phys_addr == entry->phys_addr)
+ goto err_out;
+
+ if (new->dma_addr < entry->dma_addr)
+ link = &(*link)->rb_left;
+ else
+ link = &(*link)->rb_right;
+ }
+ rb_link_node(&new->rbnode, parent, link);
+ rb_insert_color(&new->rbnode, &bus_to_phys);
+ goto out;
+
+err_out:
+ rc = -EINVAL;
+ pr_warn("%s: cannot add phys=0x%pa -> dma=0x%pa: phys=0x%pa -> dma=0x%pa already exists\n",
+ __func__, &new->phys_addr, &new->dma_addr, &entry->phys_addr, &entry->dma_addr);
+out:
+ return rc;
+}
+
+static struct xen_dma_info *xen_get_dma_info(dma_addr_t dma_addr)
+{
+ struct rb_node *n = bus_to_phys.rb_node;
+ struct xen_dma_info *entry;
+
+ while (n) {
+ entry = rb_entry(n, struct xen_dma_info, rbnode);
+ if (entry->dma_addr <= dma_addr &&
+ entry->dma_addr + entry->size > dma_addr) {
+ return entry;
+ }
+ if (dma_addr < entry->dma_addr)
+ n = n->rb_left;
+ else
+ n = n->rb_right;
+ }
+
+ return NULL;
+}
static dma_addr_t xen_phys_to_bus(phys_addr_t paddr)
{
- return phys_to_machine(XPADDR(paddr)).maddr;
+ int nr_seg;
+ unsigned long offset;
+ char *vaddr;
+
+ if (!xen_feature(XENFEAT_auto_translated_physmap))
+ return phys_to_machine(XPADDR(paddr)).maddr;
+
+ vaddr = (char *)phys_to_virt(paddr);
+ if (vaddr >= xen_io_tlb_end || vaddr < xen_io_tlb_start)
+ return DMA_ERROR_CODE;
+
+ offset = vaddr - xen_io_tlb_start;
+ nr_seg = offset / (IO_TLB_SEGSIZE << IO_TLB_SHIFT);
+
+ return xen_dma_seg[nr_seg].dma_addr +
+ (paddr - xen_dma_seg[nr_seg].phys_addr);
}
static phys_addr_t xen_bus_to_phys(dma_addr_t baddr)
{
- return machine_to_phys(XMADDR(baddr)).paddr;
+ if (xen_feature(XENFEAT_auto_translated_physmap)) {
+ struct xen_dma_info *dma = xen_get_dma_info(baddr);
+ if (dma == NULL)
+ return DMA_ERROR_CODE;
+ else
+ return dma->phys_addr + (baddr - dma->dma_addr);
+ } else
+ return machine_to_phys(XMADDR(baddr)).paddr;
}
static dma_addr_t xen_virt_to_bus(void *address)
@@ -107,6 +206,9 @@ static int is_xen_swiotlb_buffer(dma_addr_t dma_addr)
unsigned long pfn = mfn_to_local_pfn(mfn);
phys_addr_t paddr;
+ if (xen_feature(XENFEAT_auto_translated_physmap))
+ return 1;
+
/* If the address is outside our domain, it CAN
* have the same virtual address as another address
* in our domain. Therefore _only_ check address within our domain.
@@ -124,13 +226,12 @@ static int max_dma_bits = 32;
static int
xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
{
- int i, rc;
+ int i, j, rc;
int dma_bits;
- dma_addr_t dma_handle;
dma_bits = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT) + PAGE_SHIFT;
- i = 0;
+ i = j = 0;
do {
int slabs = min(nslabs - i, (unsigned long)IO_TLB_SEGSIZE);
@@ -138,12 +239,18 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
rc = xen_create_contiguous_region(
(unsigned long)buf + (i << IO_TLB_SHIFT),
get_order(slabs << IO_TLB_SHIFT),
- dma_bits, &dma_handle);
+ dma_bits, &xen_dma_seg[j].dma_addr);
} while (rc && dma_bits++ < max_dma_bits);
if (rc)
return rc;
+ xen_dma_seg[j].phys_addr = virt_to_phys(buf + (i << IO_TLB_SHIFT));
+ xen_dma_seg[j].size = slabs << IO_TLB_SHIFT;
+ rc = xen_dma_add_entry(&xen_dma_seg[j]);
+ if (rc != 0)
+ return rc;
i += slabs;
+ j++;
} while (i < nslabs);
return 0;
}
@@ -193,9 +300,10 @@ retry:
/*
* Get IO TLB memory from any location.
*/
- if (early)
+ if (early) {
xen_io_tlb_start = alloc_bootmem_pages(PAGE_ALIGN(bytes));
- else {
+ xen_dma_seg = alloc_bootmem(sizeof(struct xen_dma_info) * NR_DMA_SEGS);
+ } else {
#define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
#define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
@@ -210,6 +318,8 @@ retry:
xen_io_tlb_nslabs = SLABS_PER_PAGE << order;
bytes = xen_io_tlb_nslabs << IO_TLB_SHIFT;
}
+ xen_dma_seg = kzalloc(sizeof(struct xen_dma_info) * NR_DMA_SEGS,
+ GFP_KERNEL);
}
if (!xen_io_tlb_start) {
m_ret = XEN_SWIOTLB_ENOMEM;
@@ -232,7 +342,6 @@ retry:
m_ret = XEN_SWIOTLB_EFIXUP;
goto error;
}
- start_dma_addr = xen_virt_to_bus(xen_io_tlb_start);
if (early) {
if (swiotlb_init_with_tbl(xen_io_tlb_start, xen_io_tlb_nslabs,
verbose))
@@ -290,7 +399,8 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
phys = virt_to_phys(ret);
dev_addr = xen_phys_to_bus(phys);
- if (((dev_addr + size - 1 <= dma_mask)) &&
+ if (!xen_feature(XENFEAT_auto_translated_physmap) &&
+ ((dev_addr + size - 1 <= dma_mask)) &&
!range_straddles_page_boundary(phys, size))
*dma_handle = dev_addr;
else {
@@ -321,8 +431,9 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
phys = virt_to_phys(vaddr);
- if (((dev_addr + size - 1 > dma_mask)) ||
- range_straddles_page_boundary(phys, size))
+ if (xen_feature(XENFEAT_auto_translated_physmap) ||
+ (((dev_addr + size - 1 > dma_mask)) ||
+ range_straddles_page_boundary(phys, size)))
xen_destroy_contiguous_region((unsigned long)vaddr, order);
free_pages((unsigned long)vaddr, order);
@@ -351,14 +462,19 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
* we can safely return the device addr and not worry about bounce
* buffering it.
*/
- if (dma_capable(dev, dev_addr, size) &&
+ if (!xen_feature(XENFEAT_auto_translated_physmap) &&
+ dma_capable(dev, dev_addr, size) &&
!range_straddles_page_boundary(phys, size) && !swiotlb_force)
return dev_addr;
/*
* Oh well, have to allocate and map a bounce buffer.
+ * Pass the dma_addr of the first slab in the iotlb buffer as
+ * argument so that swiotlb_tbl_map_single is free to allocate
+ * the bounce buffer anywhere appropriate in io_tlb_start -
+ * io_tlb_end.
*/
- map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, size, dir);
+ map = swiotlb_tbl_map_single(dev, xen_dma_seg[0].dma_addr, phys, size, dir);
if (map == SWIOTLB_MAP_ERROR)
return DMA_ERROR_CODE;
@@ -494,10 +610,17 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl,
dma_addr_t dev_addr = xen_phys_to_bus(paddr);
if (swiotlb_force ||
+ xen_feature(XENFEAT_auto_translated_physmap) ||
!dma_capable(hwdev, dev_addr, sg->length) ||
range_straddles_page_boundary(paddr, sg->length)) {
+ /*
+ * Pass the dma_addr of the first slab in the iotlb buffer as
+ * argument so that swiotlb_tbl_map_single is free to allocate
+ * the bounce buffer anywhere appropriate in io_tlb_start -
+ * io_tlb_end.
+ */
phys_addr_t map = swiotlb_tbl_map_single(hwdev,
- start_dma_addr,
+ xen_dma_seg[0].dma_addr,
sg_phys(sg),
sg->length,
dir);
--
1.7.2.5
Xen on arm and arm64 needs SWIOTLB_XEN: when running on Xen we need to
program the hardware with mfns rather than pfns for DMA addresses.
Remove the SWIOTLB_XEN dependency on X86 and PCI and make XEN select
SWIOTLB_XEN on arm and arm64.
For the moment we always rely on swiotlb-xen, but once Xen gains support
for hardware IOMMUs we will be able to skip it, depending on whether an
IOMMU is present on the platform.
Implement xen_create_contiguous_region on arm and arm64 by using
XENMEM_exchange_and_pin.
Initialize xen-swiotlb from xen_early_init (before the native dma_ops
are initialized) and set dma_ops to &xen_swiotlb_dma_ops if we are
running on Xen.
Signed-off-by: Stefano Stabellini <[email protected]>
Changes in v4:
- remove redefinition of DMA_ERROR_CODE;
- update the code to use XENMEM_exchange_and_pin and XENMEM_unpin;
- add a note about hardware IOMMU in the commit message.
Changes in v3:
- code style changes;
- warn on XENMEM_put_dma_buf failures.
---
arch/arm/Kconfig | 1 +
arch/arm/include/asm/xen/page.h | 2 +
arch/arm/xen/Makefile | 2 +-
arch/arm/xen/enlighten.c | 3 +
arch/arm/xen/mm.c | 117 +++++++++++++++++++++++++++++++++++++++
arch/arm64/Kconfig | 1 +
arch/arm64/xen/Makefile | 2 +-
drivers/xen/Kconfig | 1 -
drivers/xen/swiotlb-xen.c | 16 +++++
9 files changed, 142 insertions(+), 3 deletions(-)
create mode 100644 arch/arm/xen/mm.c
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index c0bfb33..2c9d112 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1848,6 +1848,7 @@ config XEN
depends on CPU_V7 && !CPU_V6
depends on !GENERIC_ATOMIC64
select ARM_PSCI
+ select SWIOTLB_XEN
help
Say Y if you want to run Linux in a Virtual Machine on Xen on ARM.
diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
index 359a7b5..b0f7150 100644
--- a/arch/arm/include/asm/xen/page.h
+++ b/arch/arm/include/asm/xen/page.h
@@ -6,12 +6,14 @@
#include <linux/pfn.h>
#include <linux/types.h>
+#include <linux/dma-mapping.h>
#include <xen/interface/grant_table.h>
#define pfn_to_mfn(pfn) (pfn)
#define phys_to_machine_mapping_valid(pfn) (1)
#define mfn_to_pfn(mfn) (mfn)
+#define mfn_to_local_pfn(m) (mfn_to_pfn(m))
#define mfn_to_virt(m) (__va(mfn_to_pfn(m) << PAGE_SHIFT))
#define pte_mfn pte_pfn
diff --git a/arch/arm/xen/Makefile b/arch/arm/xen/Makefile
index 4384103..66fc35d 100644
--- a/arch/arm/xen/Makefile
+++ b/arch/arm/xen/Makefile
@@ -1 +1 @@
-obj-y := enlighten.o hypercall.o grant-table.o
+obj-y := enlighten.o hypercall.o grant-table.o mm.o
diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 14d17ab..06a6953 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -195,6 +195,7 @@ static void xen_power_off(void)
* documentation of the Xen Device Tree format.
*/
#define GRANT_TABLE_PHYSADDR 0
+extern int xen_mm_init(void);
void __init xen_early_init(void)
{
struct resource res;
@@ -230,6 +231,8 @@ void __init xen_early_init(void)
xen_start_info->flags |= SIF_INITDOMAIN|SIF_PRIVILEGED;
else
xen_start_info->flags &= ~(SIF_INITDOMAIN|SIF_PRIVILEGED);
+
+ xen_mm_init();
}
static int __init xen_guest_init(void)
diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
new file mode 100644
index 0000000..916deb2
--- /dev/null
+++ b/arch/arm/xen/mm.c
@@ -0,0 +1,117 @@
+#include <linux/bootmem.h>
+#include <linux/gfp.h>
+#include <linux/export.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/dma-mapping.h>
+#include <linux/vmalloc.h>
+#include <linux/swiotlb.h>
+
+#include <xen/xen.h>
+#include <xen/interface/memory.h>
+#include <xen/swiotlb-xen.h>
+
+#include <asm/cacheflush.h>
+#include <asm/xen/page.h>
+#include <asm/xen/hypercall.h>
+#include <asm/xen/interface.h>
+
+static int xen_exchange_memory(xen_ulong_t extents_in,
+ unsigned int order_in,
+ xen_pfn_t *pfns_in,
+ xen_ulong_t extents_out,
+ unsigned int order_out,
+ xen_pfn_t *mfns_out,
+ unsigned int address_bits)
+{
+ long rc;
+ int success;
+
+ struct xen_memory_exchange exchange = {
+ .in = {
+ .nr_extents = extents_in,
+ .extent_order = order_in,
+ .domid = DOMID_SELF
+ },
+ .out = {
+ .nr_extents = extents_out,
+ .extent_order = order_out,
+ .address_bits = address_bits,
+ .domid = DOMID_SELF
+ }
+ };
+ set_xen_guest_handle(exchange.in.extent_start, pfns_in);
+ set_xen_guest_handle(exchange.out.extent_start, mfns_out);
+
+ BUG_ON(extents_in << order_in != extents_out << order_out);
+
+
+ rc = HYPERVISOR_memory_op(XENMEM_exchange_and_pin, &exchange);
+ success = (exchange.nr_exchanged == extents_in);
+
+ BUG_ON(!success && ((exchange.nr_exchanged != 0) || (rc == 0)));
+ BUG_ON(success && (rc != 0));
+
+ return success;
+}
+
+int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
+ unsigned int address_bits,
+ dma_addr_t *dma_handle)
+{
+ phys_addr_t pstart = __pa(vstart);
+ xen_pfn_t in_frame, out_frame;
+ int success;
+
+ /* Get a new contiguous memory extent. */
+ in_frame = out_frame = pstart >> PAGE_SHIFT;
+ success = xen_exchange_memory(1, order, &in_frame,
+ 1, order, &out_frame,
+ address_bits);
+
+ if (!success)
+ return -ENOMEM;
+
+ *dma_handle = out_frame << PAGE_SHIFT;
+
+ return success ? 0 : -ENOMEM;
+}
+EXPORT_SYMBOL_GPL(xen_create_contiguous_region);
+
+void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order)
+{
+ xen_pfn_t in_frame = __pa(vstart) >> PAGE_SHIFT;
+ struct xen_unpin unpin = {
+ .in = {
+ .nr_extents = 1,
+ .extent_order = order,
+ .domid = DOMID_SELF
+ },
+ };
+ set_xen_guest_handle(unpin.in.extent_start, &in_frame);
+
+ WARN_ON(HYPERVISOR_memory_op(XENMEM_unpin, &unpin));
+}
+EXPORT_SYMBOL_GPL(xen_destroy_contiguous_region);
+
+static struct dma_map_ops xen_swiotlb_dma_ops = {
+ .mapping_error = xen_swiotlb_dma_mapping_error,
+ .alloc = xen_swiotlb_alloc_coherent,
+ .free = xen_swiotlb_free_coherent,
+ .sync_single_for_cpu = xen_swiotlb_sync_single_for_cpu,
+ .sync_single_for_device = xen_swiotlb_sync_single_for_device,
+ .sync_sg_for_cpu = xen_swiotlb_sync_sg_for_cpu,
+ .sync_sg_for_device = xen_swiotlb_sync_sg_for_device,
+ .map_sg = xen_swiotlb_map_sg_attrs,
+ .unmap_sg = xen_swiotlb_unmap_sg_attrs,
+ .map_page = xen_swiotlb_map_page,
+ .unmap_page = xen_swiotlb_unmap_page,
+ .dma_supported = xen_swiotlb_dma_supported,
+};
+
+int __init xen_mm_init(void)
+{
+ xen_swiotlb_init(1, true);
+ dma_ops = &xen_swiotlb_dma_ops;
+ return 0;
+}
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 9737e97..aa1f6fb 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -209,6 +209,7 @@ config XEN_DOM0
config XEN
bool "Xen guest support on ARM64 (EXPERIMENTAL)"
depends on ARM64 && OF
+ select SWIOTLB_XEN
help
Say Y if you want to run Linux in a Virtual Machine on Xen on ARM64.
diff --git a/arch/arm64/xen/Makefile b/arch/arm64/xen/Makefile
index be24040..0ef9637 100644
--- a/arch/arm64/xen/Makefile
+++ b/arch/arm64/xen/Makefile
@@ -1,2 +1,2 @@
-xen-arm-y += $(addprefix ../../arm/xen/, enlighten.o grant-table.o)
+xen-arm-y += $(addprefix ../../arm/xen/, enlighten.o grant-table.o mm.o)
obj-y := xen-arm.o hypercall.o
diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index 9e02d60..7e83688 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -140,7 +140,6 @@ config XEN_GRANT_DEV_ALLOC
config SWIOTLB_XEN
def_bool y
- depends on PCI && X86
select SWIOTLB
config XEN_TMEM
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 7bb99ae..b11a11f 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -45,6 +45,8 @@
#include <xen/xen-ops.h>
#include <xen/hvc-console.h>
#include <xen/features.h>
+#include <asm/dma-mapping.h>
+
/*
* Used to do a quick range check in swiotlb_tbl_unmap_single and
* swiotlb_tbl_sync_single_*, to see if the memory was in fact allocated by this
@@ -58,6 +60,20 @@ static unsigned long xen_io_tlb_nslabs;
* Quick lookup value of the bus address of the IOTLB.
*/
+#ifndef CONFIG_X86
+static unsigned long dma_alloc_coherent_mask(struct device *dev,
+ gfp_t gfp)
+{
+ unsigned long dma_mask = 0;
+
+ dma_mask = dev->coherent_dma_mask;
+ if (!dma_mask)
+ dma_mask = (gfp & GFP_DMA) ? DMA_BIT_MASK(24) : DMA_BIT_MASK(32);
+
+ return dma_mask;
+}
+#endif
+
struct xen_dma_info {
dma_addr_t dma_addr;
phys_addr_t phys_addr;
--
1.7.2.5
Implement xen_swiotlb_set_dma_mask, use it for set_dma_mask on arm.
Signed-off-by: Stefano Stabellini <[email protected]>
---
arch/arm/xen/mm.c | 1 +
drivers/xen/swiotlb-xen.c | 12 ++++++++++++
include/xen/swiotlb-xen.h | 2 ++
3 files changed, 15 insertions(+), 0 deletions(-)
diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index 916deb2..1477ace 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -107,6 +107,7 @@ static struct dma_map_ops xen_swiotlb_dma_ops = {
.map_page = xen_swiotlb_map_page,
.unmap_page = xen_swiotlb_unmap_page,
.dma_supported = xen_swiotlb_dma_supported,
+ .set_dma_mask = xen_swiotlb_set_dma_mask,
};
int __init xen_mm_init(void)
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index b11a11f..010da31 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -732,3 +732,15 @@ xen_swiotlb_dma_supported(struct device *hwdev, u64 mask)
return xen_virt_to_bus(xen_io_tlb_end - 1) <= mask;
}
EXPORT_SYMBOL_GPL(xen_swiotlb_dma_supported);
+
+int
+xen_swiotlb_set_dma_mask(struct device *dev, u64 dma_mask)
+{
+ if (!dev->dma_mask || !xen_swiotlb_dma_supported(dev, dma_mask))
+ return -EIO;
+
+ *dev->dma_mask = dma_mask;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(xen_swiotlb_set_dma_mask);
diff --git a/include/xen/swiotlb-xen.h b/include/xen/swiotlb-xen.h
index de8bcc6..7b64465 100644
--- a/include/xen/swiotlb-xen.h
+++ b/include/xen/swiotlb-xen.h
@@ -55,4 +55,6 @@ xen_swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr);
extern int
xen_swiotlb_dma_supported(struct device *hwdev, u64 mask);
+extern int
+xen_swiotlb_set_dma_mask(struct device *dev, u64 dma_mask);
#endif /* __LINUX_SWIOTLB_XEN_H */
--
1.7.2.5
Modify xen_create_contiguous_region to return the dma address of the
newly contiguous buffer.
Signed-off-by: Stefano Stabellini <[email protected]>
Reviewed-by: David Vrabel <[email protected]>
Changes in v4:
- use virt_to_machine instead of virt_to_bus.
---
arch/x86/xen/mmu.c | 4 +++-
drivers/xen/swiotlb-xen.c | 6 +++---
include/xen/xen-ops.h | 3 ++-
3 files changed, 8 insertions(+), 5 deletions(-)
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index fdc3ba2..6c34d7c 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -2329,7 +2329,8 @@ static int xen_exchange_memory(unsigned long extents_in, unsigned int order_in,
}
int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
- unsigned int address_bits)
+ unsigned int address_bits,
+ dma_addr_t *dma_handle)
{
unsigned long *in_frames = discontig_frames, out_frame;
unsigned long flags;
@@ -2368,6 +2369,7 @@ int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
spin_unlock_irqrestore(&xen_reservation_lock, flags);
+ *dma_handle = virt_to_machine(vstart).maddr;
return success ? 0 : -ENOMEM;
}
EXPORT_SYMBOL_GPL(xen_create_contiguous_region);
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 1b2277c..b72f31c 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -126,6 +126,7 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
{
int i, rc;
int dma_bits;
+ dma_addr_t dma_handle;
dma_bits = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT) + PAGE_SHIFT;
@@ -137,7 +138,7 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
rc = xen_create_contiguous_region(
(unsigned long)buf + (i << IO_TLB_SHIFT),
get_order(slabs << IO_TLB_SHIFT),
- dma_bits);
+ dma_bits, &dma_handle);
} while (rc && dma_bits++ < max_dma_bits);
if (rc)
return rc;
@@ -294,11 +295,10 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
*dma_handle = dev_addr;
else {
if (xen_create_contiguous_region(vstart, order,
- fls64(dma_mask)) != 0) {
+ fls64(dma_mask), dma_handle) != 0) {
free_pages(vstart, order);
return NULL;
}
- *dma_handle = virt_to_machine(ret).maddr;
}
memset(ret, 0, size);
return ret;
diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
index d6fe062..9ef704d 100644
--- a/include/xen/xen-ops.h
+++ b/include/xen/xen-ops.h
@@ -20,7 +20,8 @@ int xen_setup_shutdown_event(void);
extern unsigned long *xen_contiguous_bitmap;
int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
- unsigned int address_bits);
+ unsigned int address_bits,
+ dma_addr_t *dma_handle);
void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order);
--
1.7.2.5
Initially set dma_ops to arm_dma_ops.
Signed-off-by: Stefano Stabellini <[email protected]>
Acked-by: Konrad Rzeszutek Wilk <[email protected]>
CC: [email protected]
CC: [email protected]
Changes in v3:
- keep using arm_dma_ops in dmabounce.
---
arch/arm/include/asm/dma-mapping.h | 3 ++-
arch/arm/mm/dma-mapping.c | 3 +++
2 files changed, 5 insertions(+), 1 deletions(-)
diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index 0982206..7d6e4f9 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -13,6 +13,7 @@
#include <asm/cacheflush.h>
#define DMA_ERROR_CODE (~0)
+extern struct dma_map_ops *dma_ops;
extern struct dma_map_ops arm_dma_ops;
extern struct dma_map_ops arm_coherent_dma_ops;
@@ -20,7 +21,7 @@ static inline struct dma_map_ops *get_dma_ops(struct device *dev)
{
if (dev && dev->archdata.dma_ops)
return dev->archdata.dma_ops;
- return &arm_dma_ops;
+ return dma_ops;
}
static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 7f9b179..870b12c 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -141,6 +141,9 @@ struct dma_map_ops arm_dma_ops = {
};
EXPORT_SYMBOL(arm_dma_ops);
+struct dma_map_ops *dma_ops = &arm_dma_ops;
+EXPORT_SYMBOL(dma_ops);
+
static void *arm_coherent_dma_alloc(struct device *dev, size_t size,
dma_addr_t *handle, gfp_t gfp, struct dma_attrs *attrs);
static void arm_coherent_dma_free(struct device *dev, size_t size, void *cpu_addr,
--
1.7.2.5
Move Xen initialization earlier, before any DMA requests can be made.
Signed-off-by: Stefano Stabellini <[email protected]>
Changes in v3:
- add missing __init in xen_early_init declaration.
---
arch/arm/include/asm/xen/hypervisor.h | 8 ++++++++
arch/arm/kernel/setup.c | 2 ++
arch/arm/xen/enlighten.c | 21 ++++++++++++++-------
arch/arm64/kernel/setup.c | 2 ++
4 files changed, 26 insertions(+), 7 deletions(-)
diff --git a/arch/arm/include/asm/xen/hypervisor.h b/arch/arm/include/asm/xen/hypervisor.h
index d7ab99a..2782591 100644
--- a/arch/arm/include/asm/xen/hypervisor.h
+++ b/arch/arm/include/asm/xen/hypervisor.h
@@ -1,6 +1,8 @@
#ifndef _ASM_ARM_XEN_HYPERVISOR_H
#define _ASM_ARM_XEN_HYPERVISOR_H
+#include <linux/init.h>
+
extern struct shared_info *HYPERVISOR_shared_info;
extern struct start_info *xen_start_info;
@@ -16,4 +18,10 @@ static inline enum paravirt_lazy_mode paravirt_get_lazy_mode(void)
return PARAVIRT_LAZY_NONE;
}
+#ifdef CONFIG_XEN
+void __init xen_early_init(void);
+#else
+static inline void xen_early_init(void) { return; }
+#endif
+
#endif /* _ASM_ARM_XEN_HYPERVISOR_H */
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 63af9a7..cb7b8e2 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -45,6 +45,7 @@
#include <asm/cacheflush.h>
#include <asm/cachetype.h>
#include <asm/tlbflush.h>
+#include <asm/xen/hypervisor.h>
#include <asm/prom.h>
#include <asm/mach/arch.h>
@@ -889,6 +890,7 @@ void __init setup_arch(char **cmdline_p)
arm_dt_init_cpu_maps();
psci_init();
+ xen_early_init();
#ifdef CONFIG_SMP
if (is_smp()) {
if (!mdesc->smp_init || !mdesc->smp_init()) {
diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index c9770ba..14d17ab 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -195,21 +195,19 @@ static void xen_power_off(void)
* documentation of the Xen Device Tree format.
*/
#define GRANT_TABLE_PHYSADDR 0
-static int __init xen_guest_init(void)
+void __init xen_early_init(void)
{
- struct xen_add_to_physmap xatp;
- static struct shared_info *shared_info_page = 0;
+ struct resource res;
struct device_node *node;
int len;
const char *s = NULL;
const char *version = NULL;
const char *xen_prefix = "xen,xen-";
- struct resource res;
node = of_find_compatible_node(NULL, NULL, "xen,xen");
if (!node) {
pr_debug("No Xen support\n");
- return 0;
+ return;
}
s = of_get_property(node, "compatible", &len);
if (strlen(xen_prefix) + 3 < len &&
@@ -217,10 +215,10 @@ static int __init xen_guest_init(void)
version = s + strlen(xen_prefix);
if (version == NULL) {
pr_debug("Xen version not found\n");
- return 0;
+ return;
}
if (of_address_to_resource(node, GRANT_TABLE_PHYSADDR, &res))
- return 0;
+ return;
xen_hvm_resume_frames = res.start >> PAGE_SHIFT;
xen_events_irq = irq_of_parse_and_map(node, 0);
pr_info("Xen %s support found, events_irq=%d gnttab_frame_pfn=%lx\n",
@@ -232,6 +230,15 @@ static int __init xen_guest_init(void)
xen_start_info->flags |= SIF_INITDOMAIN|SIF_PRIVILEGED;
else
xen_start_info->flags &= ~(SIF_INITDOMAIN|SIF_PRIVILEGED);
+}
+
+static int __init xen_guest_init(void)
+{
+ struct xen_add_to_physmap xatp;
+ static struct shared_info *shared_info_page = 0;
+
+ if (!xen_domain())
+ return 0;
if (!shared_info_page)
shared_info_page = (struct shared_info *)
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index add6ea6..e0d438a 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -53,6 +53,7 @@
#include <asm/traps.h>
#include <asm/memblock.h>
#include <asm/psci.h>
+#include <asm/xen/hypervisor.h>
unsigned int processor_id;
EXPORT_SYMBOL(processor_id);
@@ -267,6 +268,7 @@ void __init setup_arch(char **cmdline_p)
unflatten_device_tree();
psci_init();
+ xen_early_init();
cpu_logical_map(0) = read_cpuid_mpidr() & MPIDR_HWID_BITMASK;
#ifdef CONFIG_SMP
--
1.7.2.5
XENMEM_exchange can't be used by autotranslate guests because of two
severe limitations:
- it does not copy back the mfns into the out field for autotranslate
guests;
- it does not guarantee that the hypervisor won't change the p2m
mappings for the exchanged pages while the guest is using them. Xen
never promises to keep the p2m mapping stable for autotranslate guests
in general. In practice it won't happen unless one uses uncommon
features like memory sharing or paging.
To overcome these problems I am introducing two new hypercalls.
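To give an idea of the intended usage from the guest side, exchanging
and pinning a single extent looks roughly like this (a sketch based on
the ARM implementation later in this series; pfn, order and
address_bits are assumed to be set up by the caller):

xen_pfn_t in_frame = pfn, out_frame = pfn;
long rc;
struct xen_memory_exchange exchange = {
	.in  = { .nr_extents = 1, .extent_order = order,
		 .domid = DOMID_SELF },
	.out = { .nr_extents = 1, .extent_order = order,
		 .address_bits = address_bits, .domid = DOMID_SELF },
};
struct xen_unpin unpin = {
	.in  = { .nr_extents = 1, .extent_order = order,
		 .domid = DOMID_SELF },
};

set_xen_guest_handle(exchange.in.extent_start, &in_frame);
set_xen_guest_handle(exchange.out.extent_start, &out_frame);
set_xen_guest_handle(unpin.in.extent_start, &in_frame);

/* 0 on complete success; out_frame then holds the DMA frame number
 * of the pinned, machine-contiguous replacement memory */
rc = HYPERVISOR_memory_op(XENMEM_exchange_and_pin, &exchange);

/* ... program the device with (out_frame << PAGE_SHIFT) ... */

/* release the pin; the p2m mapping may change again afterwards */
rc = HYPERVISOR_memory_op(XENMEM_unpin, &unpin);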
Signed-off-by: Stefano Stabellini <[email protected]>
Changes in v5:
- better comment for XENMEM_exchange_and_pin return codes;
Changes in v4:
- rename XENMEM_get_dma_buf to XENMEM_exchange_and_pin;
- rename XENMEM_put_dma_buf to XENMEM_unpin;
- improve the documentation of the new hypercalls;
- add a note about out.address_bits for XENMEM_exchange.
---
include/xen/interface/memory.h | 37 +++++++++++++++++++++++++++++++++++++
1 files changed, 37 insertions(+), 0 deletions(-)
diff --git a/include/xen/interface/memory.h b/include/xen/interface/memory.h
index 2ecfe4f..fb0c6aa 100644
--- a/include/xen/interface/memory.h
+++ b/include/xen/interface/memory.h
@@ -66,6 +66,7 @@ struct xen_memory_exchange {
/*
* [IN] Details of memory extents to be exchanged (GMFN bases).
* Note that @in.address_bits is ignored and unused.
+ * @out.address_bits should contain the address mask for the new pages.
*/
struct xen_memory_reservation in;
@@ -263,4 +264,40 @@ struct xen_remove_from_physmap {
};
DEFINE_GUEST_HANDLE_STRUCT(xen_remove_from_physmap);
+#define XENMEM_exchange_and_pin 26
+/*
+ * This hypercall is similar to XENMEM_exchange: it takes the same
+ * struct as an argument and it exchanges the pages passed in with a new
+ * set of pages. The new pages are going to be "pinned": it's guaranteed
+ * that their p2m mapping won't be changed until explicitly "unpinned".
+ * The content of the exchanged pages is lost.
+ * Only normal guest r/w memory can be pinned: no granted pages or
+ * ballooned pages.
+ * If return code is zero then @out.extent_list provides the DMA frame
+ * numbers of the newly-allocated memory.
+ * Returns zero on complete success, otherwise a negative error code:
+ * -ENOSYS if not implemented
+ * -EINVAL if the page is already pinned
+ * -EFAULT if an internal error occurs
+ * On complete success then always @nr_exchanged == @in.nr_extents. On
+ * partial success @nr_exchanged indicates how much work was done and a
+ * negative error code is returned.
+ */
+
+#define XENMEM_unpin 27
+/*
+ * XENMEM_unpin unpins a set of pages, previously pinned by
+ * XENMEM_exchange_and_pin. After this call the p2m mapping of the pages can
+ * be transparently changed by the hypervisor, as usual. The pages are
+ * still accessible from the guest.
+ */
+struct xen_unpin {
+ /*
+ * [IN] Details of memory extents to be unpinned (GMFN bases).
+ * Note that @in.address_bits is ignored and unused.
+ */
+ struct xen_memory_reservation in;
+};
+DEFINE_GUEST_HANDLE_STRUCT(xen_unpin);
+
#endif /* __XEN_PUBLIC_MEMORY_H__ */
--
1.7.2.5
IOMMU_HELPER is needed because SWIOTLB calls iommu_is_span_boundary,
provided by lib/iommu-helper.c.
Signed-off-by: Stefano Stabellini <[email protected]>
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
CC: [email protected]
CC: [email protected]
Changes in v5:
- implement dma_mark_clean using dmac_flush_range.
Changes in v3:
- dma_capable: do not treat dma_mask as a limit;
- remove SWIOTLB dependency on NEED_SG_DMA_LENGTH.
---
arch/arm/Kconfig | 6 ++++++
arch/arm/include/asm/dma-mapping.h | 32 ++++++++++++++++++++++++++++++++
2 files changed, 38 insertions(+), 0 deletions(-)
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index ba412e0..c0bfb33 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1832,6 +1832,12 @@ config CC_STACKPROTECTOR
neutralized via a kernel panic.
This feature requires gcc version 4.2 or above.
+config SWIOTLB
+ def_bool y
+
+config IOMMU_HELPER
+ def_bool SWIOTLB
+
config XEN_DOM0
def_bool y
depends on XEN
diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index 5b579b9..0982206 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -10,6 +10,7 @@
#include <asm-generic/dma-coherent.h>
#include <asm/memory.h>
+#include <asm/cacheflush.h>
#define DMA_ERROR_CODE (~0)
extern struct dma_map_ops arm_dma_ops;
@@ -86,6 +87,37 @@ static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
}
#endif
+static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
+{
+ unsigned int offset = paddr & ~PAGE_MASK;
+ return pfn_to_dma(dev, paddr >> PAGE_SHIFT) + offset;
+}
+
+static inline phys_addr_t dma_to_phys(struct device *dev, dma_addr_t dev_addr)
+{
+ unsigned int offset = dev_addr & ~PAGE_MASK;
+ return (dma_to_pfn(dev, dev_addr) << PAGE_SHIFT) + offset;
+}
+
+static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t size)
+{
+ u64 limit, mask = *dev->dma_mask;
+
+ limit = (mask + 1) & ~mask;
+ if (limit && size > limit)
+ return 0;
+
+ if ((addr | (addr + size - 1)) & ~mask)
+ return 0;
+
+ return 1;
+}
+
+static inline void dma_mark_clean(void *addr, size_t size)
+{
+ dmac_flush_range(addr, addr + size);
+}
+
/*
* DMA errors are defined by all-bits-set in the DMA address.
*/
--
1.7.2.5
Signed-off-by: Stefano Stabellini <[email protected]>
Acked-by: Catalin Marinas <[email protected]>
CC: [email protected]
---
arch/arm64/mm/dma-mapping.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 4bd7579..a006e84 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -63,6 +63,8 @@ static struct dma_map_ops arm64_swiotlb_dma_ops = {
void __init arm64_swiotlb_init(void)
{
+ if (dma_ops != NULL)
+ return;
dma_ops = &arm64_swiotlb_dma_ops;
swiotlb_init(1);
}
--
1.7.2.5
The swiotlb code has appropriate calls to dma_mark_clean in place for
buffers passed to swiotlb_map_page as an argument. However it assumes
that the swiotlb bounce buffer (io_tlb_start-io_tlb_end) is already
coherent and doesn't need any calls to dma_mark_clean.
On ARM the swiotlb bounce buffer is not coherent (the memory is
writealloc while it should be bufferable) and therefore we need to call
dma_mark_clean appropriately on the bounce buffer code paths too.
Note that most architectures have an empty dma_mark_clean implementation
anyway.
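For reference, the ARM dma_mark_clean introduced earlier in this series
is just a cache flush over the affected range:

static inline void dma_mark_clean(void *addr, size_t size)
{
	dmac_flush_range(addr, addr + size);
}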
Signed-off-by: Stefano Stabellini <[email protected]>
---
lib/swiotlb.c | 12 +++++++++---
1 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 4e8686c..b1f43f6 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -547,7 +547,10 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
* First, sync the memory before unmapping the entry
*/
if (orig_addr && ((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL)))
+ {
+ dma_mark_clean(phys_to_virt(tlb_addr), size);
swiotlb_bounce(orig_addr, tlb_addr, size, DMA_FROM_DEVICE);
+ }
/*
* Return the buffer to the free list by setting the corresponding
@@ -587,17 +590,20 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
switch (target) {
case SYNC_FOR_CPU:
- if (likely(dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL))
+ if (likely(dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)) {
+ dma_mark_clean(phys_to_virt(tlb_addr), size);
swiotlb_bounce(orig_addr, tlb_addr,
size, DMA_FROM_DEVICE);
+ }
else
BUG_ON(dir != DMA_TO_DEVICE);
break;
case SYNC_FOR_DEVICE:
- if (likely(dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
+ if (likely(dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)) {
swiotlb_bounce(orig_addr, tlb_addr,
size, DMA_TO_DEVICE);
- else
+ dma_mark_clean(phys_to_virt(tlb_addr), size);
+ } else
BUG_ON(dir != DMA_FROM_DEVICE);
break;
default:
--
1.7.2.5
xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for the
CPU and for devices. On native x86 and ARMv8 it is sufficient to call
__get_free_pages to get a coherent buffer, while on ARM we need to call
arm_dma_ops.alloc.
Introduce xen_alloc_coherent_pages to abstract the arch-specific buffer
allocation.
Similarly, introduce xen_free_coherent_pages to free a coherent buffer:
on x86 and ARM64 it is simply a call to free_pages, while on ARM it is a
call to arm_dma_ops.free.
Signed-off-by: Stefano Stabellini <[email protected]>
---
arch/arm/include/asm/xen/page-coherent.h | 22 ++++++++++++++++++++++
arch/arm64/include/asm/xen/page-coherent.h | 24 ++++++++++++++++++++++++
arch/ia64/include/asm/xen/page-coherent.h | 24 ++++++++++++++++++++++++
arch/x86/include/asm/xen/page-coherent.h | 24 ++++++++++++++++++++++++
4 files changed, 94 insertions(+), 0 deletions(-)
create mode 100644 arch/arm/include/asm/xen/page-coherent.h
create mode 100644 arch/arm64/include/asm/xen/page-coherent.h
create mode 100644 arch/ia64/include/asm/xen/page-coherent.h
create mode 100644 arch/x86/include/asm/xen/page-coherent.h
diff --git a/arch/arm/include/asm/xen/page-coherent.h b/arch/arm/include/asm/xen/page-coherent.h
new file mode 100644
index 0000000..af2cf8d
--- /dev/null
+++ b/arch/arm/include/asm/xen/page-coherent.h
@@ -0,0 +1,22 @@
+#ifndef _ASM_ARM_XEN_PAGE_COHERENT_H
+#define _ASM_ARM_XEN_PAGE_COHERENT_H
+
+#include <asm/page.h>
+#include <linux/dma-attrs.h>
+#include <linux/dma-mapping.h>
+
+static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
+ dma_addr_t *dma_handle, gfp_t flags,
+ struct dma_attrs *attrs)
+{
+ return arm_dma_ops.alloc(hwdev, size, dma_handle, flags, attrs);
+}
+
+static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
+ void *cpu_addr, dma_addr_t dma_handle,
+ struct dma_attrs *attrs)
+{
+ return arm_dma_ops.free(hwdev, size, cpu_addr, dma_handle, attrs);
+}
+
+#endif /* _ASM_ARM_XEN_PAGE_COHERENT_H */
diff --git a/arch/arm64/include/asm/xen/page-coherent.h b/arch/arm64/include/asm/xen/page-coherent.h
new file mode 100644
index 0000000..0d6ad25
--- /dev/null
+++ b/arch/arm64/include/asm/xen/page-coherent.h
@@ -0,0 +1,24 @@
+#ifndef _ASM_ARM64_XEN_PAGE_COHERENT_H
+#define _ASM_ARM64_XEN_PAGE_COHERENT_H
+
+#include <asm/page.h>
+#include <linux/dma-attrs.h>
+#include <linux/dma-mapping.h>
+
+static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
+ dma_addr_t *dma_handle, gfp_t flags,
+ struct dma_attrs *attrs)
+{
+ void *vstart = (void*)__get_free_pages(flags, get_order(size));
+ *dma_handle = virt_to_phys(vstart);
+ return vstart;
+}
+
+static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
+ void *cpu_addr, dma_addr_t dma_handle,
+ struct dma_attrs *attrs)
+{
+ free_pages((unsigned long) cpu_addr, get_order(size));
+}
+
+#endif /* _ASM_ARM64_XEN_PAGE_COHERENT_H */
diff --git a/arch/ia64/include/asm/xen/page-coherent.h b/arch/ia64/include/asm/xen/page-coherent.h
new file mode 100644
index 0000000..37b929c
--- /dev/null
+++ b/arch/ia64/include/asm/xen/page-coherent.h
@@ -0,0 +1,24 @@
+#ifndef _ASM_IA64_XEN_PAGE_COHERENT_H
+#define _ASM_IA64_XEN_PAGE_COHERENT_H
+
+#include <asm/page.h>
+#include <linux/dma-attrs.h>
+#include <linux/dma-mapping.h>
+
+static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
+ dma_addr_t *dma_handle, gfp_t flags,
+ struct dma_attrs *attrs)
+{
+ void *vstart = (void*)__get_free_pages(flags, get_order(size));
+ *dma_handle = virt_to_phys(vstart);
+ return vstart;
+}
+
+static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
+ void *cpu_addr, dma_addr_t dma_handle,
+ struct dma_attrs *attrs)
+{
+ free_pages((unsigned long) cpu_addr, get_order(size));
+}
+
+#endif /* _ASM_IA64_XEN_PAGE_COHERENT_H */
diff --git a/arch/x86/include/asm/xen/page-coherent.h b/arch/x86/include/asm/xen/page-coherent.h
new file mode 100644
index 0000000..31de2e0
--- /dev/null
+++ b/arch/x86/include/asm/xen/page-coherent.h
@@ -0,0 +1,24 @@
+#ifndef _ASM_X86_XEN_PAGE_COHERENT_H
+#define _ASM_X86_XEN_PAGE_COHERENT_H
+
+#include <asm/page.h>
+#include <linux/dma-attrs.h>
+#include <linux/dma-mapping.h>
+
+static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
+ dma_addr_t *dma_handle, gfp_t flags,
+ struct dma_attrs *attrs)
+{
+ void *vstart = (void*)__get_free_pages(flags, get_order(size));
+ *dma_handle = virt_to_phys(vstart);
+ return vstart;
+}
+
+static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
+ void *cpu_addr, dma_addr_t dma_handle,
+ struct dma_attrs *attrs)
+{
+ free_pages((unsigned long) cpu_addr, get_order(size));
+}
+
+#endif /* _ASM_X86_XEN_PAGE_COHERENT_H */
--
1.7.2.5
Use xen_alloc_coherent_pages and xen_free_coherent_pages to allocate or
free coherent pages.
We need to be careful handling the pointer returned by
xen_alloc_coherent_pages, because on ARM the pointer is not equal to
phys_to_virt(*dma_handle). In fact, calling virt_to_phys on the returned
pointer does not yield a valid physical address.
Make xen_create_contiguous_region take a phys_addr_t as its start
parameter to avoid the virt_to_phys calls, which would be incorrect.
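Concretely, the allocation path in xen_swiotlb_alloc_coherent ends up
looking roughly like this (a condensed fragment of the change below;
hwdev, size, flags, attrs, order and dma_mask are as in the real
function, and the "always create a contiguous region" step is a
simplification of the actual conditions):

/* xen_alloc_coherent_pages returns the cpu pointer and fills in
 * *dma_handle with the physical address of the buffer; on ARM the
 * pointer is a dma_ops mapping, so virt_to_phys(ret) cannot be used
 * to recover the physical address */
ret = xen_alloc_coherent_pages(hwdev, size, dma_handle, flags, attrs);
phys = *dma_handle;

if (xen_create_contiguous_region(phys, order,
				 fls64(dma_mask), dma_handle) != 0) {
	xen_free_coherent_pages(hwdev, size, ret, (dma_addr_t)phys, attrs);
	return NULL;
}
/* *dma_handle now holds the machine (bus) address of the buffer */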
Signed-off-by: Stefano Stabellini <[email protected]>
---
arch/arm/xen/mm.c | 7 +++----
arch/x86/xen/mmu.c | 7 +++++--
drivers/xen/swiotlb-xen.c | 24 +++++++++++++++---------
include/xen/xen-ops.h | 4 ++--
4 files changed, 25 insertions(+), 17 deletions(-)
diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index 1477ace..9163d4c 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -55,11 +55,10 @@ static int xen_exchange_memory(xen_ulong_t extents_in,
return success;
}
-int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
+int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
unsigned int address_bits,
dma_addr_t *dma_handle)
{
- phys_addr_t pstart = __pa(vstart);
xen_pfn_t in_frame, out_frame;
int success;
@@ -78,9 +77,9 @@ int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
}
EXPORT_SYMBOL_GPL(xen_create_contiguous_region);
-void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order)
+void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order)
{
- xen_pfn_t in_frame = __pa(vstart) >> PAGE_SHIFT;
+ xen_pfn_t in_frame = pstart >> PAGE_SHIFT;
struct xen_unpin unpin = {
.in = {
.nr_extents = 1,
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 6c34d7c..07ee0e8 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -2328,13 +2328,14 @@ static int xen_exchange_memory(unsigned long extents_in, unsigned int order_in,
return success;
}
-int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
+int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
unsigned int address_bits,
dma_addr_t *dma_handle)
{
unsigned long *in_frames = discontig_frames, out_frame;
unsigned long flags;
int success;
+ unsigned long vstart = (unsigned long) phys_to_virt(pstart);
/*
* Currently an auto-translated guest will not perform I/O, nor will
@@ -2374,11 +2375,12 @@ int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
}
EXPORT_SYMBOL_GPL(xen_create_contiguous_region);
-void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order)
+void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order)
{
unsigned long *out_frames = discontig_frames, in_frame;
unsigned long flags;
int success;
+ unsigned long vstart;
if (xen_feature(XENFEAT_auto_translated_physmap))
return;
@@ -2386,6 +2388,7 @@ void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order)
if (unlikely(order > MAX_CONTIG_ORDER))
return;
+ vstart = (unsigned long) phys_to_virt(pstart);
memset((void *) vstart, 0, PAGE_SIZE << order);
spin_lock_irqsave(&xen_reservation_lock, flags);
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 010da31..c96b928 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -46,6 +46,7 @@
#include <xen/hvc-console.h>
#include <xen/features.h>
#include <asm/dma-mapping.h>
+#include <asm/xen/page-coherent.h>
/*
* Used to do a quick range check in swiotlb_tbl_unmap_single and
@@ -244,6 +245,7 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
{
int i, j, rc;
int dma_bits;
+ phys_addr_t p = virt_to_phys(buf);
dma_bits = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT) + PAGE_SHIFT;
@@ -253,7 +255,7 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
do {
rc = xen_create_contiguous_region(
- (unsigned long)buf + (i << IO_TLB_SHIFT),
+ p + (i << IO_TLB_SHIFT),
get_order(slabs << IO_TLB_SHIFT),
dma_bits, &xen_dma_seg[j].dma_addr);
} while (rc && dma_bits++ < max_dma_bits);
@@ -389,7 +391,6 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
void *ret;
int order = get_order(size);
u64 dma_mask = DMA_BIT_MASK(32);
- unsigned long vstart;
phys_addr_t phys;
dma_addr_t dev_addr;
@@ -404,8 +405,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
if (dma_alloc_from_coherent(hwdev, size, dma_handle, &ret))
return ret;
- vstart = __get_free_pages(flags, order);
- ret = (void *)vstart;
+ ret = xen_alloc_coherent_pages(hwdev, size, dma_handle, flags, attrs);
if (!ret)
return ret;
@@ -413,16 +413,20 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
if (hwdev && hwdev->coherent_dma_mask)
dma_mask = dma_alloc_coherent_mask(hwdev, flags);
- phys = virt_to_phys(ret);
+ /* At this point dma_handle is the physical address, next we are
+ * going to set it to the machine address.
+ * Do not use virt_to_phys(ret) because on ARM it doesn't correspond
+ * to *dma_handle. */
+ phys = *dma_handle;
dev_addr = xen_phys_to_bus(phys);
if (!xen_feature(XENFEAT_auto_translated_physmap) &&
((dev_addr + size - 1 <= dma_mask)) &&
!range_straddles_page_boundary(phys, size))
*dma_handle = dev_addr;
else {
- if (xen_create_contiguous_region(vstart, order,
+ if (xen_create_contiguous_region(phys, order,
fls64(dma_mask), dma_handle) != 0) {
- free_pages(vstart, order);
+ xen_free_coherent_pages(hwdev, size, ret, (dma_addr_t)phys, attrs);
return NULL;
}
}
@@ -445,14 +449,16 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
if (hwdev && hwdev->coherent_dma_mask)
dma_mask = hwdev->coherent_dma_mask;
- phys = virt_to_phys(vaddr);
+ /* do not use virt_to_phys because on ARM it doesn't return you the
+ * physical address */
+ phys = xen_bus_to_phys(dev_addr);
if (xen_feature(XENFEAT_auto_translated_physmap) ||
(((dev_addr + size - 1 > dma_mask)) ||
range_straddles_page_boundary(phys, size)))
xen_destroy_contiguous_region((unsigned long)vaddr, order);
- free_pages((unsigned long)vaddr, order);
+ xen_free_coherent_pages(hwdev, size, vaddr, (dma_addr_t)phys, attrs);
}
EXPORT_SYMBOL_GPL(xen_swiotlb_free_coherent);
diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
index 9ef704d..fb2ea8f 100644
--- a/include/xen/xen-ops.h
+++ b/include/xen/xen-ops.h
@@ -19,11 +19,11 @@ void xen_arch_resume(void);
int xen_setup_shutdown_event(void);
extern unsigned long *xen_contiguous_bitmap;
-int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
+int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
unsigned int address_bits,
dma_addr_t *dma_handle);
-void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order);
+void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order);
struct vm_area_struct;
int xen_remap_domain_mfn_range(struct vm_area_struct *vma,
--
1.7.2.5
On Thu, Aug 29, 2013 at 07:32:28PM +0100, Stefano Stabellini wrote:
> Modify xen_create_contiguous_region to return the dma address of the
> newly contiguous buffer.
>
> Signed-off-by: Stefano Stabellini <[email protected]>
> Reviewed-by: David Vrabel <[email protected]>
Acked-by or Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
>
>
> Changes in v4:
> - use virt_to_machine instead of virt_to_bus.
On Thu, Aug 29, 2013 at 07:32:29PM +0100, Stefano Stabellini wrote:
> Support autotranslate guests in swiotlb-xen by keeping track of the
> phys-to-bus and bus-to-phys mappings of the swiotlb buffer
> (xen_io_tlb_start-xen_io_tlb_end).
>
> Use a simple direct access on a pre-allocated array for phys-to-bus
> queries. Use a red-black tree for bus-to-phys queries.
>
> Signed-off-by: Stefano Stabellini <[email protected]>
> Reviewed-by: David Vrabel <[email protected]>
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
>
>
> Changes in v5:
> - fix xen_dma_add_entry error path;
> - remove the spin_lock: the red-black tree is not modified at run time.
>
> Changes in v4:
> - add err_out label in xen_dma_add_entry;
> - remove INVALID_ADDRESS, use DMA_ERROR_CODE instead;
> - code style fixes;
> - add in-code comments regarding the usage of xen_dma_seg[0].dma_addr.
>
> Changes in v3:
> - many code style and name changes;
> - improve error checks in xen_dma_add_entry.
> ---
> drivers/xen/swiotlb-xen.c | 155 ++++++++++++++++++++++++++++++++++++++++-----
> 1 files changed, 139 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index b72f31c..7bb99ae 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -38,32 +38,131 @@
> #include <linux/bootmem.h>
> #include <linux/dma-mapping.h>
> #include <linux/export.h>
> +#include <linux/slab.h>
> +#include <linux/rbtree.h>
> #include <xen/swiotlb-xen.h>
> #include <xen/page.h>
> #include <xen/xen-ops.h>
> #include <xen/hvc-console.h>
> +#include <xen/features.h>
> /*
> * Used to do a quick range check in swiotlb_tbl_unmap_single and
> * swiotlb_tbl_sync_single_*, to see if the memory was in fact allocated by this
> * API.
> */
>
> +#define NR_DMA_SEGS ((xen_io_tlb_nslabs + IO_TLB_SEGSIZE - 1) / IO_TLB_SEGSIZE)
> static char *xen_io_tlb_start, *xen_io_tlb_end;
> static unsigned long xen_io_tlb_nslabs;
> /*
> * Quick lookup value of the bus address of the IOTLB.
> */
>
> -static u64 start_dma_addr;
> +struct xen_dma_info {
> + dma_addr_t dma_addr;
> + phys_addr_t phys_addr;
> + size_t size;
> + struct rb_node rbnode;
> +};
> +
> +/*
> + * This array of struct xen_dma_info is indexed by physical addresses,
> + * starting from virt_to_phys(xen_io_tlb_start). Each entry maps
> + * (IO_TLB_SEGSIZE << IO_TLB_SHIFT) bytes, except the last one that is
> + * smaller. Getting the dma address corresponding to a given physical
> + * address can be done by direct access with the right index on the
> + * array.
> + */
> +static struct xen_dma_info *xen_dma_seg;
> +/*
> + * This tree keeps track of bus address to physical address
> + * mappings.
> + */
> +static struct rb_root bus_to_phys = RB_ROOT;
> +
> +static int xen_dma_add_entry(struct xen_dma_info *new)
> +{
> + struct rb_node **link = &bus_to_phys.rb_node;
> + struct rb_node *parent = NULL;
> + struct xen_dma_info *entry;
> + int rc = 0;
> +
> + while (*link) {
> + parent = *link;
> + entry = rb_entry(parent, struct xen_dma_info, rbnode);
> +
> + if (new->dma_addr == entry->dma_addr)
> + goto err_out;
> + if (new->phys_addr == entry->phys_addr)
> + goto err_out;
> +
> + if (new->dma_addr < entry->dma_addr)
> + link = &(*link)->rb_left;
> + else
> + link = &(*link)->rb_right;
> + }
> + rb_link_node(&new->rbnode, parent, link);
> + rb_insert_color(&new->rbnode, &bus_to_phys);
> + goto out;
> +
> +err_out:
> + rc = -EINVAL;
> + pr_warn("%s: cannot add phys=0x%pa -> dma=0x%pa: phys=0x%pa -> dma=0x%pa already exists\n",
> + __func__, &new->phys_addr, &new->dma_addr, &entry->phys_addr, &entry->dma_addr);
> +out:
> + return rc;
> +}
> +
> +static struct xen_dma_info *xen_get_dma_info(dma_addr_t dma_addr)
> +{
> + struct rb_node *n = bus_to_phys.rb_node;
> + struct xen_dma_info *entry;
> +
> + while (n) {
> + entry = rb_entry(n, struct xen_dma_info, rbnode);
> + if (entry->dma_addr <= dma_addr &&
> + entry->dma_addr + entry->size > dma_addr) {
> + return entry;
> + }
> + if (dma_addr < entry->dma_addr)
> + n = n->rb_left;
> + else
> + n = n->rb_right;
> + }
> +
> + return NULL;
> +}
>
> static dma_addr_t xen_phys_to_bus(phys_addr_t paddr)
> {
> - return phys_to_machine(XPADDR(paddr)).maddr;
> + int nr_seg;
> + unsigned long offset;
> + char *vaddr;
> +
> + if (!xen_feature(XENFEAT_auto_translated_physmap))
> + return phys_to_machine(XPADDR(paddr)).maddr;
> +
> + vaddr = (char *)phys_to_virt(paddr);
> + if (vaddr >= xen_io_tlb_end || vaddr < xen_io_tlb_start)
> + return DMA_ERROR_CODE;
> +
> + offset = vaddr - xen_io_tlb_start;
> + nr_seg = offset / (IO_TLB_SEGSIZE << IO_TLB_SHIFT);
> +
> + return xen_dma_seg[nr_seg].dma_addr +
> + (paddr - xen_dma_seg[nr_seg].phys_addr);
> }
>
> static phys_addr_t xen_bus_to_phys(dma_addr_t baddr)
> {
> - return machine_to_phys(XMADDR(baddr)).paddr;
> + if (xen_feature(XENFEAT_auto_translated_physmap)) {
> + struct xen_dma_info *dma = xen_get_dma_info(baddr);
> + if (dma == NULL)
> + return DMA_ERROR_CODE;
> + else
> + return dma->phys_addr + (baddr - dma->dma_addr);
> + } else
> + return machine_to_phys(XMADDR(baddr)).paddr;
> }
>
> static dma_addr_t xen_virt_to_bus(void *address)
> @@ -107,6 +206,9 @@ static int is_xen_swiotlb_buffer(dma_addr_t dma_addr)
> unsigned long pfn = mfn_to_local_pfn(mfn);
> phys_addr_t paddr;
>
> + if (xen_feature(XENFEAT_auto_translated_physmap))
> + return 1;
> +
> /* If the address is outside our domain, it CAN
> * have the same virtual address as another address
> * in our domain. Therefore _only_ check address within our domain.
> @@ -124,13 +226,12 @@ static int max_dma_bits = 32;
> static int
> xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
> {
> - int i, rc;
> + int i, j, rc;
> int dma_bits;
> - dma_addr_t dma_handle;
>
> dma_bits = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT) + PAGE_SHIFT;
>
> - i = 0;
> + i = j = 0;
> do {
> int slabs = min(nslabs - i, (unsigned long)IO_TLB_SEGSIZE);
>
> @@ -138,12 +239,18 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
> rc = xen_create_contiguous_region(
> (unsigned long)buf + (i << IO_TLB_SHIFT),
> get_order(slabs << IO_TLB_SHIFT),
> - dma_bits, &dma_handle);
> + dma_bits, &xen_dma_seg[j].dma_addr);
> } while (rc && dma_bits++ < max_dma_bits);
> if (rc)
> return rc;
>
> + xen_dma_seg[j].phys_addr = virt_to_phys(buf + (i << IO_TLB_SHIFT));
> + xen_dma_seg[j].size = slabs << IO_TLB_SHIFT;
> + rc = xen_dma_add_entry(&xen_dma_seg[j]);
> + if (rc != 0)
> + return rc;
> i += slabs;
> + j++;
> } while (i < nslabs);
> return 0;
> }
> @@ -193,9 +300,10 @@ retry:
> /*
> * Get IO TLB memory from any location.
> */
> - if (early)
> + if (early) {
> xen_io_tlb_start = alloc_bootmem_pages(PAGE_ALIGN(bytes));
> - else {
> + xen_dma_seg = alloc_bootmem(sizeof(struct xen_dma_info) * NR_DMA_SEGS);
> + } else {
> #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
> #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
> while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
> @@ -210,6 +318,8 @@ retry:
> xen_io_tlb_nslabs = SLABS_PER_PAGE << order;
> bytes = xen_io_tlb_nslabs << IO_TLB_SHIFT;
> }
> + xen_dma_seg = kzalloc(sizeof(struct xen_dma_info) * NR_DMA_SEGS,
> + GFP_KERNEL);
> }
> if (!xen_io_tlb_start) {
> m_ret = XEN_SWIOTLB_ENOMEM;
> @@ -232,7 +342,6 @@ retry:
> m_ret = XEN_SWIOTLB_EFIXUP;
> goto error;
> }
> - start_dma_addr = xen_virt_to_bus(xen_io_tlb_start);
> if (early) {
> if (swiotlb_init_with_tbl(xen_io_tlb_start, xen_io_tlb_nslabs,
> verbose))
> @@ -290,7 +399,8 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
>
> phys = virt_to_phys(ret);
> dev_addr = xen_phys_to_bus(phys);
> - if (((dev_addr + size - 1 <= dma_mask)) &&
> + if (!xen_feature(XENFEAT_auto_translated_physmap) &&
> + ((dev_addr + size - 1 <= dma_mask)) &&
> !range_straddles_page_boundary(phys, size))
> *dma_handle = dev_addr;
> else {
> @@ -321,8 +431,9 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
>
> phys = virt_to_phys(vaddr);
>
> - if (((dev_addr + size - 1 > dma_mask)) ||
> - range_straddles_page_boundary(phys, size))
> + if (xen_feature(XENFEAT_auto_translated_physmap) ||
> + (((dev_addr + size - 1 > dma_mask)) ||
> + range_straddles_page_boundary(phys, size)))
> xen_destroy_contiguous_region((unsigned long)vaddr, order);
>
> free_pages((unsigned long)vaddr, order);
> @@ -351,14 +462,19 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
> * we can safely return the device addr and not worry about bounce
> * buffering it.
> */
> - if (dma_capable(dev, dev_addr, size) &&
> + if (!xen_feature(XENFEAT_auto_translated_physmap) &&
> + dma_capable(dev, dev_addr, size) &&
> !range_straddles_page_boundary(phys, size) && !swiotlb_force)
> return dev_addr;
>
> /*
> * Oh well, have to allocate and map a bounce buffer.
> + * Pass the dma_addr of the first slab in the iotlb buffer as
> + * argument so that swiotlb_tbl_map_single is free to allocate
> + * the bounce buffer anywhere appropriate in io_tlb_start -
> + * io_tlb_end.
> */
> - map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, size, dir);
> + map = swiotlb_tbl_map_single(dev, xen_dma_seg[0].dma_addr, phys, size, dir);
> if (map == SWIOTLB_MAP_ERROR)
> return DMA_ERROR_CODE;
>
> @@ -494,10 +610,17 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl,
> dma_addr_t dev_addr = xen_phys_to_bus(paddr);
>
> if (swiotlb_force ||
> + xen_feature(XENFEAT_auto_translated_physmap) ||
> !dma_capable(hwdev, dev_addr, sg->length) ||
> range_straddles_page_boundary(paddr, sg->length)) {
> + /*
> + * Pass the dma_addr of the first slab in the iotlb buffer as
> + * argument so that swiotlb_tbl_map_single is free to allocate
> + * the bounce buffer anywhere appropriate in io_tlb_start -
> + * io_tlb_end.
> + */
> phys_addr_t map = swiotlb_tbl_map_single(hwdev,
> - start_dma_addr,
> + xen_dma_seg[0].dma_addr,
> sg_phys(sg),
> sg->length,
> dir);
> --
> 1.7.2.5
>
On Thu, Aug 29, 2013 at 07:32:33PM +0100, Stefano Stabellini wrote:
> Use xen_alloc_coherent_pages and xen_free_coherent_pages to allocate or
> free coherent pages.
>
> We need to be careful handling the pointer returned by
> xen_alloc_coherent_pages, because on ARM the pointer is not equal to
> phys_to_virt(*dma_handle). In fact virt_to_phys on the returned pointer
> doesn't return a valid physical address.
Why is it called 'virt_to_phys'? What does it return then?
>
> Make xen_create_contiguous_region take a phys_addr_t as start parameter to
> avoid the virt_to_phys calls which would be incorrect.
>
> Signed-off-by: Stefano Stabellini <[email protected]>
> ---
> arch/arm/xen/mm.c | 7 +++----
> arch/x86/xen/mmu.c | 7 +++++--
> drivers/xen/swiotlb-xen.c | 24 +++++++++++++++---------
> include/xen/xen-ops.h | 4 ++--
> 4 files changed, 25 insertions(+), 17 deletions(-)
>
> diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
> index 1477ace..9163d4c 100644
> --- a/arch/arm/xen/mm.c
> +++ b/arch/arm/xen/mm.c
> @@ -55,11 +55,10 @@ static int xen_exchange_memory(xen_ulong_t extents_in,
> return success;
> }
>
> -int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
> +int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
> unsigned int address_bits,
> dma_addr_t *dma_handle)
> {
> - phys_addr_t pstart = __pa(vstart);
> xen_pfn_t in_frame, out_frame;
> int success;
>
> @@ -78,9 +77,9 @@ int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
> }
> EXPORT_SYMBOL_GPL(xen_create_contiguous_region);
>
> -void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order)
> +void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order)
> {
> - xen_pfn_t in_frame = __pa(vstart) >> PAGE_SHIFT;
> + xen_pfn_t in_frame = pstart >> PAGE_SHIFT;
> struct xen_unpin unpin = {
> .in = {
> .nr_extents = 1,
> diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
> index 6c34d7c..07ee0e8 100644
> --- a/arch/x86/xen/mmu.c
> +++ b/arch/x86/xen/mmu.c
> @@ -2328,13 +2328,14 @@ static int xen_exchange_memory(unsigned long extents_in, unsigned int order_in,
> return success;
> }
>
> -int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
> +int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
> unsigned int address_bits,
> dma_addr_t *dma_handle)
> {
> unsigned long *in_frames = discontig_frames, out_frame;
> unsigned long flags;
> int success;
> + unsigned long vstart = (unsigned long) phys_to_virt(pstart);
Extra space.
>
> /*
> * Currently an auto-translated guest will not perform I/O, nor will
> @@ -2374,11 +2375,12 @@ int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
> }
> EXPORT_SYMBOL_GPL(xen_create_contiguous_region);
>
> -void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order)
> +void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order)
> {
> unsigned long *out_frames = discontig_frames, in_frame;
> unsigned long flags;
> int success;
> + unsigned long vstart;
>
> if (xen_feature(XENFEAT_auto_translated_physmap))
> return;
> @@ -2386,6 +2388,7 @@ void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order)
> if (unlikely(order > MAX_CONTIG_ORDER))
> return;
>
> + vstart = (unsigned long) phys_to_virt(pstart);
Ditto
> memset((void *) vstart, 0, PAGE_SIZE << order);
>
> spin_lock_irqsave(&xen_reservation_lock, flags);
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 010da31..c96b928 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -46,6 +46,7 @@
> #include <xen/hvc-console.h>
> #include <xen/features.h>
> #include <asm/dma-mapping.h>
> +#include <asm/xen/page-coherent.h>
>
> /*
> * Used to do a quick range check in swiotlb_tbl_unmap_single and
> @@ -244,6 +245,7 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
> {
> int i, j, rc;
> int dma_bits;
> + phys_addr_t p = virt_to_phys(buf);
>
> dma_bits = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT) + PAGE_SHIFT;
>
> @@ -253,7 +255,7 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
>
> do {
> rc = xen_create_contiguous_region(
> - (unsigned long)buf + (i << IO_TLB_SHIFT),
> + p + (i << IO_TLB_SHIFT),
> get_order(slabs << IO_TLB_SHIFT),
> dma_bits, &xen_dma_seg[j].dma_addr);
> } while (rc && dma_bits++ < max_dma_bits);
> @@ -389,7 +391,6 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
> void *ret;
> int order = get_order(size);
> u64 dma_mask = DMA_BIT_MASK(32);
> - unsigned long vstart;
> phys_addr_t phys;
> dma_addr_t dev_addr;
>
> @@ -404,8 +405,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
> if (dma_alloc_from_coherent(hwdev, size, dma_handle, &ret))
> return ret;
>
> - vstart = __get_free_pages(flags, order);
> - ret = (void *)vstart;
> + ret = xen_alloc_coherent_pages(hwdev, size, dma_handle, flags, attrs);
>
> if (!ret)
> return ret;
> @@ -413,16 +413,20 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
> if (hwdev && hwdev->coherent_dma_mask)
> dma_mask = dma_alloc_coherent_mask(hwdev, flags);
>
> - phys = virt_to_phys(ret);
> + /* At this point dma_handle is the physical address, next we are
> + * going to set it to the machine address.
> + * Do not use virt_to_phys(ret) because on ARM it doesn't correspond
> + * to *dma_handle. */
> + phys = *dma_handle;
> dev_addr = xen_phys_to_bus(phys);
> if (!xen_feature(XENFEAT_auto_translated_physmap) &&
> ((dev_addr + size - 1 <= dma_mask)) &&
> !range_straddles_page_boundary(phys, size))
> *dma_handle = dev_addr;
> else {
> - if (xen_create_contiguous_region(vstart, order,
> + if (xen_create_contiguous_region(phys, order,
> fls64(dma_mask), dma_handle) != 0) {
> - free_pages(vstart, order);
> + xen_free_coherent_pages(hwdev, size, ret, (dma_addr_t)phys, attrs);
> return NULL;
> }
> }
> @@ -445,14 +449,16 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
> if (hwdev && hwdev->coherent_dma_mask)
> dma_mask = hwdev->coherent_dma_mask;
>
> - phys = virt_to_phys(vaddr);
> + /* do not use virt_to_phys because on ARM it doesn't return you the
> + * physical address */
> + phys = xen_bus_to_phys(dev_addr);
>
> if (xen_feature(XENFEAT_auto_translated_physmap) ||
> (((dev_addr + size - 1 > dma_mask)) ||
> range_straddles_page_boundary(phys, size)))
> xen_destroy_contiguous_region((unsigned long)vaddr, order);
>
> - free_pages((unsigned long)vaddr, order);
> + xen_free_coherent_pages(hwdev, size, vaddr, (dma_addr_t)phys, attrs);
> }
> EXPORT_SYMBOL_GPL(xen_swiotlb_free_coherent);
>
> diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
> index 9ef704d..fb2ea8f 100644
> --- a/include/xen/xen-ops.h
> +++ b/include/xen/xen-ops.h
> @@ -19,11 +19,11 @@ void xen_arch_resume(void);
> int xen_setup_shutdown_event(void);
>
> extern unsigned long *xen_contiguous_bitmap;
> -int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
> +int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
> unsigned int address_bits,
> dma_addr_t *dma_handle);
>
> -void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order);
> +void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order);
>
> struct vm_area_struct;
> int xen_remap_domain_mfn_range(struct vm_area_struct *vma,
> --
> 1.7.2.5
>
On 08/29/2013 07:32 PM, Stefano Stabellini wrote:
> Initially set dma_ops to arm_dma_ops.
>
>
> Signed-off-by: Stefano Stabellini <[email protected]>
> Acked-by: Konrad Rzeszutek Wilk <[email protected]>
> CC: [email protected]
> CC: [email protected]
>
>
> Changes in v3:
> - keep using arm_dma_ops in dmabounce.
> ---
> arch/arm/include/asm/dma-mapping.h | 3 ++-
> arch/arm/mm/dma-mapping.c | 3 +++
> 2 files changed, 5 insertions(+), 1 deletions(-)
>
> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> index 0982206..7d6e4f9 100644
> --- a/arch/arm/include/asm/dma-mapping.h
> +++ b/arch/arm/include/asm/dma-mapping.h
> @@ -13,6 +13,7 @@
> #include <asm/cacheflush.h>
>
> #define DMA_ERROR_CODE (~0)
> +extern struct dma_map_ops *dma_ops;
Hi,
I tried to build your swiotlb patch series for the Arndale. I have a compilation
error because dma_ops is already used by the Samsung sound driver (sound/soc/samsung/dma.c).
This small fix allows me to build the series for the Arndale.
Do I need to send it separately?
=======================================================================
commit 73d4ceded87f52fa958b92d8d8d65be485e90857
Author: Julien Grall <[email protected]>
Date: Mon Sep 2 15:36:35 2013 +0100
ASoC: Samsung: Rename dma_ops to samsung_dma_ops
The commit "arm: introduce a global dma_ops pointer" introduces a compilation issue
when CONFIG_SND_SOC_SAMSUNG is enabled.
sound/soc/samsung/dma.c:345:27: error: conflicting types for 'dma_ops'
/local/home/julien/works/arndale/linux/arch/arm/include/asm/dma-mapping.h:16:28:
note: previous declaration of 'dma_ops' was here
Signed-off-by: Julien Grall <[email protected]>
diff --git a/sound/soc/samsung/dma.c b/sound/soc/samsung/dma.c
index ddea134..c341603 100644
--- a/sound/soc/samsung/dma.c
+++ b/sound/soc/samsung/dma.c
@@ -342,7 +342,7 @@ static int dma_mmap(struct snd_pcm_substream *substream,
runtime->dma_bytes);
}
-static struct snd_pcm_ops dma_ops = {
+static struct snd_pcm_ops samsung_dma_ops = {
.open = dma_open,
.close = dma_close,
.ioctl = snd_pcm_lib_ioctl,
@@ -429,7 +429,7 @@ out:
}
static struct snd_soc_platform_driver samsung_asoc_platform = {
- .ops = &dma_ops,
+ .ops = &samsung_dma_ops,
.pcm_new = dma_new,
.pcm_free = dma_free_dma_buffers,
};
--
Julien Grall
On Thu, Aug 29, 2013 at 07:32:24PM +0100, Stefano Stabellini wrote:
> Signed-off-by: Stefano Stabellini <[email protected]>
> CC: [email protected]
> CC: [email protected]
> ---
> arch/arm64/include/asm/dma-mapping.h | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h
> index 8d18100..c2cb8a0 100644
> --- a/arch/arm64/include/asm/dma-mapping.h
> +++ b/arch/arm64/include/asm/dma-mapping.h
> @@ -25,6 +25,7 @@
>
> #define ARCH_HAS_DMA_GET_REQUIRED_MASK
>
> +#define DMA_ERROR_CODE (~0)
I wonder why this isn't more generic. It looks like most (all?)
architectures define it in the same way.
--
Catalin
On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
> xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
> and devices. On native x86 and ARMv8 is sufficient to call
> __get_free_pages in order to get a coherent buffer, while on ARM we need
> to call arm_dma_ops.alloc.
Don't bet on this for ARMv8. It's not mandated for the architecture, so
at some point some SoC will require non-cacheable buffers for coherency.
--
Catalin
On Thu, Aug 29, 2013 at 07:32:26PM +0100, Stefano Stabellini wrote:
> Move Xen initialization earlier, before any DMA requests can be made.
>
> Signed-off-by: Stefano Stabellini <[email protected]>
I guess you should cc the corresponding maintainers here.
> arch/arm/include/asm/xen/hypervisor.h | 8 ++++++++
> arch/arm/kernel/setup.c | 2 ++
> arch/arm/xen/enlighten.c | 21 ++++++++++++++-------
> arch/arm64/kernel/setup.c | 2 ++
> 4 files changed, 26 insertions(+), 7 deletions(-)
[...]
> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -53,6 +53,7 @@
> #include <asm/traps.h>
> #include <asm/memblock.h>
> #include <asm/psci.h>
> +#include <asm/xen/hypervisor.h>
>
> unsigned int processor_id;
> EXPORT_SYMBOL(processor_id);
> @@ -267,6 +268,7 @@ void __init setup_arch(char **cmdline_p)
> unflatten_device_tree();
>
> psci_init();
> + xen_early_init();
So Xen guests don't have any hope for single Image? Basically you set
dma_ops unconditionally in xen_early_init(), even if the kernel is not
intended to run under Xen.
--
Catalin
On Fri, 30 Aug 2013, Konrad Rzeszutek Wilk wrote:
> On Thu, Aug 29, 2013 at 07:32:33PM +0100, Stefano Stabellini wrote:
> > Use xen_alloc_coherent_pages and xen_free_coherent_pages to allocate or
> > free coherent pages.
> >
> > We need to be careful handling the pointer returned by
> > xen_alloc_coherent_pages, because on ARM the pointer is not equal to
> > phys_to_virt(*dma_handle). In fact virt_to_phys on the returned pointer
> > doesn't return a valid physical address.
>
> Why is it called 'virt_to_phys'? What does it return then?
virt_to_phys only works for kernel direct-mapped RAM.
In this case the virtual address could be an ioremap address, so
passing it to virt_to_phys would give you a physical address that
doesn't correspond to it.
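To illustrate the restriction, here is a minimal, hypothetical sketch (the MMIO
address and function name are made up, not taken from the series):

#include <linux/slab.h>
#include <linux/io.h>

/* virt_to_phys() is only meaningful for addresses in the kernel's direct
 * (lowmem) mapping, such as kmalloc() memory. Feeding it a remapped
 * address compiles fine but yields a physical address that does not
 * correspond to the mapped page. */
static void virt_to_phys_example(void)
{
	void *linear = kmalloc(64, GFP_KERNEL);              /* direct-mapped */
	void __iomem *remapped = ioremap(0x80000000, 4096);  /* made-up MMIO addr */
	phys_addr_t good, bogus;

	if (!linear || !remapped)
		goto out;

	good = virt_to_phys(linear);                     /* valid result */
	bogus = virt_to_phys((void __force *)remapped);  /* wrong: not lowmem */
	(void)good;
	(void)bogus;
out:
	kfree(linear);                                   /* kfree(NULL) is safe */
	if (remapped)
		iounmap(remapped);
}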
On Thu, 5 Sep 2013, Catalin Marinas wrote:
> On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
> > xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
> > and devices. On native x86 and ARMv8 is sufficient to call
> > __get_free_pages in order to get a coherent buffer, while on ARM we need
> > to call arm_dma_ops.alloc.
>
> Don't bet on this for ARMv8. It's not mandated for the architecture, so
> at some point some SoC will require non-cacheable buffers for coherency.
I see.
Would it be better if I implemented xen_alloc_coherent_pages on armv8 by
calling arm64_swiotlb_dma_ops.alloc?
On Mon, 2 Sep 2013, Julien Grall wrote:
> On 08/29/2013 07:32 PM, Stefano Stabellini wrote:
> > Initially set dma_ops to arm_dma_ops.
> >
> >
> > Signed-off-by: Stefano Stabellini <[email protected]>
> > Acked-by: Konrad Rzeszutek Wilk <[email protected]>
> > CC: [email protected]
> > CC: [email protected]
> >
> >
> > Changes in v3:
> > - keep using arm_dma_ops in dmabounce.
> > ---
> > arch/arm/include/asm/dma-mapping.h | 3 ++-
> > arch/arm/mm/dma-mapping.c | 3 +++
> > 2 files changed, 5 insertions(+), 1 deletions(-)
> >
> > diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> > index 0982206..7d6e4f9 100644
> > --- a/arch/arm/include/asm/dma-mapping.h
> > +++ b/arch/arm/include/asm/dma-mapping.h
> > @@ -13,6 +13,7 @@
> > #include <asm/cacheflush.h>
> >
> > #define DMA_ERROR_CODE (~0)
> > +extern struct dma_map_ops *dma_ops;
>
> Hi,
>
> I tried to build your swiotlb patch series for the Arndale. I have a compilation
> error because dma_ops is already used in samsung sound driver (sound/soc/samsung/dma.c).
>
> This small fix allow me to built this serie for the Arndale.
> Do I need to send it separately?
This looks like a good fix, I'll append it to the series
> =======================================================================
> commit 73d4ceded87f52fa958b92d8d8d65be485e90857
> Author: Julien Grall <[email protected]>
> Date: Mon Sep 2 15:36:35 2013 +0100
>
> ASoC: Samsung: Rename dma_ops by samsung_dma_ops
>
> The commit "arm: introduce a global dma_ops pointer" introduce compilation issue
> when CONFIG_SND_SOC_SAMSUNG is enabled.
>
> sound/soc/samsung/dma.c:345:27: error: conflicting types for 'dma_ops'
> /local/home/julien/works/arndale/linux/arch/arm/include/asm/dma-mapping.h:16:28:
> note: previous declaration of 'dma_ops' was here
>
> Signed-off-by: Julien Grall <[email protected]>
>
> diff --git a/sound/soc/samsung/dma.c b/sound/soc/samsung/dma.c
> index ddea134..c341603 100644
> --- a/sound/soc/samsung/dma.c
> +++ b/sound/soc/samsung/dma.c
> @@ -342,7 +342,7 @@ static int dma_mmap(struct snd_pcm_substream *substream,
> runtime->dma_bytes);
> }
>
> -static struct snd_pcm_ops dma_ops = {
> +static struct snd_pcm_ops samsung_dma_ops = {
> .open = dma_open,
> .close = dma_close,
> .ioctl = snd_pcm_lib_ioctl,
> @@ -429,7 +429,7 @@ out:
> }
>
> static struct snd_soc_platform_driver samsung_asoc_platform = {
> - .ops = &dma_ops,
> + .ops = &samsung_dma_ops,
> .pcm_new = dma_new,
> .pcm_free = dma_free_dma_buffers,
> };
>
> --
> Julien Grall
>
On Thu, 5 Sep 2013, Catalin Marinas wrote:
> On Thu, Aug 29, 2013 at 07:32:24PM +0100, Stefano Stabellini wrote:
> > Signed-off-by: Stefano Stabellini <[email protected]>
> > CC: [email protected]
> > CC: [email protected]
> > ---
> > arch/arm64/include/asm/dma-mapping.h | 1 +
> > 1 files changed, 1 insertions(+), 0 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h
> > index 8d18100..c2cb8a0 100644
> > --- a/arch/arm64/include/asm/dma-mapping.h
> > +++ b/arch/arm64/include/asm/dma-mapping.h
> > @@ -25,6 +25,7 @@
> >
> > #define ARCH_HAS_DMA_GET_REQUIRED_MASK
> >
> > +#define DMA_ERROR_CODE (~0)
>
> I wonder why this isn't more generic. It looks like most (all?)
> architectures define it in the same way.
Actually it's defined as 0 on x86 and ia64.
All the others define it as ~0.
On Thu, 5 Sep 2013, Catalin Marinas wrote:
> On Thu, Aug 29, 2013 at 07:32:26PM +0100, Stefano Stabellini wrote:
> > Move Xen initialization earlier, before any DMA requests can be made.
> >
> > Signed-off-by: Stefano Stabellini <[email protected]>
>
> I guess you should cc the corresponding maintainers here.
Thanks for the reminder, I'll do that.
> > arch/arm/include/asm/xen/hypervisor.h | 8 ++++++++
> > arch/arm/kernel/setup.c | 2 ++
> > arch/arm/xen/enlighten.c | 21 ++++++++++++++-------
> > arch/arm64/kernel/setup.c | 2 ++
> > 4 files changed, 26 insertions(+), 7 deletions(-)
>
> [...]
>
> > --- a/arch/arm64/kernel/setup.c
> > +++ b/arch/arm64/kernel/setup.c
> > @@ -53,6 +53,7 @@
> > #include <asm/traps.h>
> > #include <asm/memblock.h>
> > #include <asm/psci.h>
> > +#include <asm/xen/hypervisor.h>
> >
> > unsigned int processor_id;
> > EXPORT_SYMBOL(processor_id);
> > @@ -267,6 +268,7 @@ void __init setup_arch(char **cmdline_p)
> > unflatten_device_tree();
> >
> > psci_init();
> > + xen_early_init();
>
> So Xen guests don't have any hope for single Image? Basically you set
> dma_ops unconditionally in xen_early_init(), even if the kernel is not
> intended to run under Xen.
That should not happen: if we are not running on Xen xen_early_init
returns early, before calling xen_mm_init.
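A minimal sketch of that guard, simplified from the DT probe the enlighten
code already uses (everything else is elided):

#include <linux/init.h>
#include <linux/of.h>

void xen_mm_init(void);	/* provided by arch/arm/xen/mm.c in the series */

/* Simplified sketch: if there is no "xen,xen" node in the device tree we
 * are not running under Xen, so we bail out before xen_mm_init() ever
 * gets the chance to replace dma_ops. */
void __init xen_early_init(void)
{
	struct device_node *node;

	node = of_find_compatible_node(NULL, NULL, "xen,xen");
	if (!node)
		return;			/* bare metal / other hypervisor */

	/* ... version check, shared_info and grant table setup elided ... */

	xen_mm_init();			/* only reached when running on Xen */
}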
On Thu, 2013-09-05 at 17:59 +0100, Stefano Stabellini wrote:
> On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > On Thu, Aug 29, 2013 at 07:32:26PM +0100, Stefano Stabellini wrote:
> > > Move Xen initialization earlier, before any DMA requests can be made.
> > >
> > > Signed-off-by: Stefano Stabellini <[email protected]>
> >
> > I guess you should cc the corresponding maintainers here.
>
> Thanks for the reminder, I'll do that.
>
>
> > > arch/arm/include/asm/xen/hypervisor.h | 8 ++++++++
> > > arch/arm/kernel/setup.c | 2 ++
> > > arch/arm/xen/enlighten.c | 21 ++++++++++++++-------
> > > arch/arm64/kernel/setup.c | 2 ++
> > > 4 files changed, 26 insertions(+), 7 deletions(-)
> >
> > [...]
> >
> > > --- a/arch/arm64/kernel/setup.c
> > > +++ b/arch/arm64/kernel/setup.c
> > > @@ -53,6 +53,7 @@
> > > #include <asm/traps.h>
> > > #include <asm/memblock.h>
> > > #include <asm/psci.h>
> > > +#include <asm/xen/hypervisor.h>
> > >
> > > unsigned int processor_id;
> > > EXPORT_SYMBOL(processor_id);
> > > @@ -267,6 +268,7 @@ void __init setup_arch(char **cmdline_p)
> > > unflatten_device_tree();
> > >
> > > psci_init();
> > > + xen_early_init();
> >
> > So Xen guests don't have any hope for single Image? Basically you set
> > dma_ops unconditionally in xen_early_init(), even if the kernel is not
> > intended to run under Xen.
>
> That should not happen: if we are not running on Xen xen_early_init
> returns early, before calling xen_mm_init.
x86 has a call to init_hypervisor_platform() at approximately this
location, which detects and calls the init function for any of Xen, KVM,
hyperv and vmware. I guess only Xen and KVM are currently relevant on
Linux ARM(64), so perhaps adding similar infrastructure on ARM would be
overkill at this point. I don't know if KVM needs such an early C-land
hook, I suppose it needs it even earlier so it can set up the hyp mode
trampoline from head.S?
Ian.
On Fri, Sep 06, 2013 at 09:58:59AM +0100, Ian Campbell wrote:
> On Thu, 2013-09-05 at 17:59 +0100, Stefano Stabellini wrote:
> > On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > > On Thu, Aug 29, 2013 at 07:32:26PM +0100, Stefano Stabellini wrote:
> > > > Move Xen initialization earlier, before any DMA requests can be made.
> > > >
> > > > Signed-off-by: Stefano Stabellini <[email protected]>
> > >
> > > I guess you should cc the corresponding maintainers here.
> >
> > Thanks for the reminder, I'll do that.
> >
> >
> > > > arch/arm/include/asm/xen/hypervisor.h | 8 ++++++++
> > > > arch/arm/kernel/setup.c | 2 ++
> > > > arch/arm/xen/enlighten.c | 21 ++++++++++++++-------
> > > > arch/arm64/kernel/setup.c | 2 ++
> > > > 4 files changed, 26 insertions(+), 7 deletions(-)
> > >
> > > [...]
> > >
> > > > --- a/arch/arm64/kernel/setup.c
> > > > +++ b/arch/arm64/kernel/setup.c
> > > > @@ -53,6 +53,7 @@
> > > > #include <asm/traps.h>
> > > > #include <asm/memblock.h>
> > > > #include <asm/psci.h>
> > > > +#include <asm/xen/hypervisor.h>
> > > >
> > > > unsigned int processor_id;
> > > > EXPORT_SYMBOL(processor_id);
> > > > @@ -267,6 +268,7 @@ void __init setup_arch(char **cmdline_p)
> > > > unflatten_device_tree();
> > > >
> > > > psci_init();
> > > > + xen_early_init();
> > >
> > > So Xen guests don't have any hope for single Image? Basically you set
> > > dma_ops unconditionally in xen_early_init(), even if the kernel is not
> > > intended to run under Xen.
> >
> > That should not happen: if we are not running on Xen xen_early_init
> > returns early, before calling xen_mm_init.
>
> x96 has a call to init_hypervisor_platform() at approximately this
> location, which detects and calls the init function for any of Xen, KVM,
> hyperv and vmware.
I would rather have a core_initcall(xen_early_init()) if possible,
rather than hard-coded calls in setup_arch(). This early stuff is
DT-driven, so in theory you don't need a specific xen call. The only
thing is that you end up with swiotlb_init() and 64MB wasted if the Xen
guest does not plan to use them.
> I guess only Xen and KVM are currently relevant on Linux ARM(64), so
> perhaps adding similar infrastructure on ARM would be overkill at this
> point. I don't know if KVM needs such an early C-land hook, I suppose
> it needs it even earlier so it can set up the hyp mode trampoline from
> head.S?
head.S installs a Hyp stub if it starts in that mode and then forgets
about it. Later, when KVM is initialised, it installs its own code by doing
an HVC call.
--
Catalin
On Thu, Sep 05, 2013 at 05:43:33PM +0100, Stefano Stabellini wrote:
> On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
> > > xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
> > > and devices. On native x86 and ARMv8 is sufficient to call
> > > __get_free_pages in order to get a coherent buffer, while on ARM we need
> > > to call arm_dma_ops.alloc.
> >
> > Don't bet on this for ARMv8. It's not mandated for the architecture, so
> > at some point some SoC will require non-cacheable buffers for coherency.
>
> I see.
> Would it be better if I implemented xen_alloc_coherent_pages on armv8 by
> calling arm64_swiotlb_dma_ops.alloc?
What does this buffer do exactly? Is it allocated by guests?
Currently arm64_swiotlb_dma_ops assume cache-coherent DMA. I have a
patch which introduces new ops for non-coherent DMA but this should
really be orthogonal to swiotlb. You can basically have 4 combinations
of coherent/non-coherent and swiotlb/iommu. Mark Rutland is currently
looking into how best to describe this via DT as it may not even be per
SoC but per bus or device.
--
Catalin
On Thu, Sep 05, 2013 at 05:50:45PM +0100, Stefano Stabellini wrote:
> On Fri, 30 Aug 2013, Konrad Rzeszutek Wilk wrote:
> > On Thu, Aug 29, 2013 at 07:32:33PM +0100, Stefano Stabellini wrote:
> > > Use xen_alloc_coherent_pages and xen_free_coherent_pages to allocate or
> > > free coherent pages.
> > >
> > > We need to be careful handling the pointer returned by
> > > xen_alloc_coherent_pages, because on ARM the pointer is not equal to
> > > phys_to_virt(*dma_handle). In fact virt_to_phys on the returned pointer
> > > doesn't return a valid physical address.
> >
> > Why is it called 'virt_to_phys'? What does it return then?
>
> virt_to_phys only works for kernel direct mapped RAM memory.
> In this case the virtual address could be an ioremap address, therefore
> passing it to virt_to_phys would give you another physical address that
> doesn't correspond to it.
Ah, so very much ARM (and in some ways SPARC) specific. I think you need to add some comments
in the code (and the git commit) to explain this. Thank you.
On Fri, Sep 06, 2013 at 03:09:21PM +0100, Catalin Marinas wrote:
> On Fri, Sep 06, 2013 at 09:58:59AM +0100, Ian Campbell wrote:
> > On Thu, 2013-09-05 at 17:59 +0100, Stefano Stabellini wrote:
> > > On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > > > On Thu, Aug 29, 2013 at 07:32:26PM +0100, Stefano Stabellini wrote:
> > > > > Move Xen initialization earlier, before any DMA requests can be made.
> > > > >
> > > > > Signed-off-by: Stefano Stabellini <[email protected]>
> > > >
> > > > I guess you should cc the corresponding maintainers here.
> > >
> > > Thanks for the reminder, I'll do that.
> > >
> > >
> > > > > arch/arm/include/asm/xen/hypervisor.h | 8 ++++++++
> > > > > arch/arm/kernel/setup.c | 2 ++
> > > > > arch/arm/xen/enlighten.c | 21 ++++++++++++++-------
> > > > > arch/arm64/kernel/setup.c | 2 ++
> > > > > 4 files changed, 26 insertions(+), 7 deletions(-)
> > > >
> > > > [...]
> > > >
> > > > > --- a/arch/arm64/kernel/setup.c
> > > > > +++ b/arch/arm64/kernel/setup.c
> > > > > @@ -53,6 +53,7 @@
> > > > > #include <asm/traps.h>
> > > > > #include <asm/memblock.h>
> > > > > #include <asm/psci.h>
> > > > > +#include <asm/xen/hypervisor.h>
> > > > >
> > > > > unsigned int processor_id;
> > > > > EXPORT_SYMBOL(processor_id);
> > > > > @@ -267,6 +268,7 @@ void __init setup_arch(char **cmdline_p)
> > > > > unflatten_device_tree();
> > > > >
> > > > > psci_init();
> > > > > + xen_early_init();
> > > >
> > > > So Xen guests don't have any hope for single Image? Basically you set
> > > > dma_ops unconditionally in xen_early_init(), even if the kernel is not
> > > > intended to run under Xen.
> > >
> > > That should not happen: if we are not running on Xen xen_early_init
> > > returns early, before calling xen_mm_init.
> >
> > x96 has a call to init_hypervisor_platform() at approximately this
> > location, which detects and calls the init function for any of Xen, KVM,
> > hyperv and vmware.
>
> I would rather have a core_initcall(xen_early_init()) if possible,
> rather than hard-coded calls in setup_arch(). This early stuff is
> DT-driven, so in theory you don't need a specific xen call. The only
> thing is that you end up with swiotlb_init() and 64MB wasted if the Xen
> guest does not plan to use them.
There is a swiotlb_free mechanism in case the allocation was not
necessary.
>
> > I guess only Xen and KVM are currently relevant on Linux ARM(64), so
> > perhaps adding similar infrastructure on ARM would be overkill at this
> > point. I don't know if KVM needs such an early C-land hook, I suppose
> > it needs it even earlier so it can set up the hyp mode trampoline from
> > head.S?
>
> head.S installs a Hyp stub if it starts in that mode and then forget
> about. Later when KVM is initialised it installs its own code by doing
> an HVC call.
>
> --
> Catalin
On Fri, 6 Sep 2013, Konrad Rzeszutek Wilk wrote:
> On Fri, Sep 06, 2013 at 03:09:21PM +0100, Catalin Marinas wrote:
> > On Fri, Sep 06, 2013 at 09:58:59AM +0100, Ian Campbell wrote:
> > > On Thu, 2013-09-05 at 17:59 +0100, Stefano Stabellini wrote:
> > > > On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > > > > On Thu, Aug 29, 2013 at 07:32:26PM +0100, Stefano Stabellini wrote:
> > > > > > Move Xen initialization earlier, before any DMA requests can be made.
> > > > > >
> > > > > > Signed-off-by: Stefano Stabellini <[email protected]>
> > > > >
> > > > > I guess you should cc the corresponding maintainers here.
> > > >
> > > > Thanks for the reminder, I'll do that.
> > > >
> > > >
> > > > > > arch/arm/include/asm/xen/hypervisor.h | 8 ++++++++
> > > > > > arch/arm/kernel/setup.c | 2 ++
> > > > > > arch/arm/xen/enlighten.c | 21 ++++++++++++++-------
> > > > > > arch/arm64/kernel/setup.c | 2 ++
> > > > > > 4 files changed, 26 insertions(+), 7 deletions(-)
> > > > >
> > > > > [...]
> > > > >
> > > > > > --- a/arch/arm64/kernel/setup.c
> > > > > > +++ b/arch/arm64/kernel/setup.c
> > > > > > @@ -53,6 +53,7 @@
> > > > > > #include <asm/traps.h>
> > > > > > #include <asm/memblock.h>
> > > > > > #include <asm/psci.h>
> > > > > > +#include <asm/xen/hypervisor.h>
> > > > > >
> > > > > > unsigned int processor_id;
> > > > > > EXPORT_SYMBOL(processor_id);
> > > > > > @@ -267,6 +268,7 @@ void __init setup_arch(char **cmdline_p)
> > > > > > unflatten_device_tree();
> > > > > >
> > > > > > psci_init();
> > > > > > + xen_early_init();
> > > > >
> > > > > So Xen guests don't have any hope for single Image? Basically you set
> > > > > dma_ops unconditionally in xen_early_init(), even if the kernel is not
> > > > > intended to run under Xen.
> > > >
> > > > That should not happen: if we are not running on Xen xen_early_init
> > > > returns early, before calling xen_mm_init.
> > >
> > > x96 has a call to init_hypervisor_platform() at approximately this
> > > location, which detects and calls the init function for any of Xen, KVM,
> > > hyperv and vmware.
> >
> > I would rather have a core_initcall(xen_early_init()) if possible,
> > rather than hard-coded calls in setup_arch(). This early stuff is
> > DT-driven, so in theory you don't need a specific xen call. The only
> > thing is that you end up with swiotlb_init() and 64MB wasted if the Xen
> > guest does not plan to use them.
>
> There is a swiotlb_free mechanism in case the allocation was not
> neccessary.
I think I'll just use the late swiotlb initialization.
On Fri, 6 Sep 2013, Catalin Marinas wrote:
> On Thu, Sep 05, 2013 at 05:43:33PM +0100, Stefano Stabellini wrote:
> > On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > > On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
> > > > xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
> > > > and devices. On native x86 and ARMv8 is sufficient to call
> > > > __get_free_pages in order to get a coherent buffer, while on ARM we need
> > > > to call arm_dma_ops.alloc.
> > >
> > > Don't bet on this for ARMv8. It's not mandated for the architecture, so
> > > at some point some SoC will require non-cacheable buffers for coherency.
> >
> > I see.
> > Would it be better if I implemented xen_alloc_coherent_pages on armv8 by
> > calling arm64_swiotlb_dma_ops.alloc?
>
> What does this buffer do exactly? Is it allocated by guests?
It is allocated by Dom0 to do DMA to/from a device.
It is the buffer that is going to be returned by dma_map_ops.alloc to
the caller:
On x86:
dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> __get_free_pages
On ARM:
dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> arm_dma_ops.alloc
On ARM64
dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> ????
> Currently arm64_swiotlb_dma_ops assume cache-coherent DMA. I have a
> patch which introduces new ops for non-coherent DMA but this should
> really be orthogonal to swiotlb. You can basically have 4 combinations
> of coherent/non-coherent and swiotlb/iommu. Mark Rutland is currently
> looking into how best to describe this via DT as it may not even be per
> SoC but per bus or device.
It seems to me that calling arm64_swiotlb_dma_ops.alloc would ensure
that we allocate the right buffer for the right device?
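For context, this is roughly what such an arm64 page-coherent.h helper could
look like. It is a hypothetical sketch, mirroring the arm variant added by the
series, and assumes arm64_swiotlb_dma_ops were made accessible for this
purpose (it is not exported today):

#include <linux/dma-mapping.h>
#include <linux/dma-attrs.h>

extern struct dma_map_ops arm64_swiotlb_dma_ops;	/* assumed accessible */

/* Defer to the arch dma_map_ops instead of __get_free_pages, so the
 * buffer gets whatever cacheability the platform actually needs. */
static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
					     dma_addr_t *dma_handle, gfp_t flags,
					     struct dma_attrs *attrs)
{
	return arm64_swiotlb_dma_ops.alloc(hwdev, size, dma_handle, flags, attrs);
}

static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
					   void *cpu_addr, dma_addr_t dma_handle,
					   struct dma_attrs *attrs)
{
	arm64_swiotlb_dma_ops.free(hwdev, size, cpu_addr, dma_handle, attrs);
}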
On Fri, 6 Sep 2013, Konrad Rzeszutek Wilk wrote:
> On Thu, Sep 05, 2013 at 05:50:45PM +0100, Stefano Stabellini wrote:
> > On Fri, 30 Aug 2013, Konrad Rzeszutek Wilk wrote:
> > > On Thu, Aug 29, 2013 at 07:32:33PM +0100, Stefano Stabellini wrote:
> > > > Use xen_alloc_coherent_pages and xen_free_coherent_pages to allocate or
> > > > free coherent pages.
> > > >
> > > > We need to be careful handling the pointer returned by
> > > > xen_alloc_coherent_pages, because on ARM the pointer is not equal to
> > > > phys_to_virt(*dma_handle). In fact virt_to_phys on the returned pointer
> > > > doesn't return a valid physical address.
> > >
> > > Why is it called 'virt_to_phys'? What does it return then?
> >
> > virt_to_phys only works for kernel direct mapped RAM memory.
> > In this case the virtual address could be an ioremap address, therefore
> > passing it to virt_to_phys would give you another physical address that
> > doesn't correspond to it.
>
> Ah, so very much ARM (and in some way SPARC) specific. I think you need add some comments
> in the code (And git commit) to explain this. Thank you.
OK
On Fri, Sep 06, 2013 at 03:59:02PM +0100, Stefano Stabellini wrote:
> On Fri, 6 Sep 2013, Catalin Marinas wrote:
> > On Thu, Sep 05, 2013 at 05:43:33PM +0100, Stefano Stabellini wrote:
> > > On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > > > On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
> > > > > xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
> > > > > and devices. On native x86 and ARMv8 is sufficient to call
> > > > > __get_free_pages in order to get a coherent buffer, while on ARM we need
> > > > > to call arm_dma_ops.alloc.
> > > >
> > > > Don't bet on this for ARMv8. It's not mandated for the architecture, so
> > > > at some point some SoC will require non-cacheable buffers for coherency.
> > >
> > > I see.
> > > Would it be better if I implemented xen_alloc_coherent_pages on armv8 by
> > > calling arm64_swiotlb_dma_ops.alloc?
> >
> > What does this buffer do exactly? Is it allocated by guests?
>
> It is allocated by Dom0 to do DMA to/from a device.
> It is the buffer that is going to be returned by dma_map_ops.alloc to
> the caller:
>
> On x86:
> dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> __get_free_pages
>
> On ARM:
> dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> arm_dma_ops.alloc
>
> On ARM64
> dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> ????
OK, I'm getting more confused. Do all the above calls happen in the
guest, Dom0, or a mix?
--
Catalin
On Fri, 6 Sep 2013, Catalin Marinas wrote:
> On Fri, Sep 06, 2013 at 03:59:02PM +0100, Stefano Stabellini wrote:
> > On Fri, 6 Sep 2013, Catalin Marinas wrote:
> > > On Thu, Sep 05, 2013 at 05:43:33PM +0100, Stefano Stabellini wrote:
> > > > On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > > > > On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
> > > > > > xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
> > > > > > and devices. On native x86 and ARMv8 is sufficient to call
> > > > > > __get_free_pages in order to get a coherent buffer, while on ARM we need
> > > > > > to call arm_dma_ops.alloc.
> > > > >
> > > > > Don't bet on this for ARMv8. It's not mandated for the architecture, so
> > > > > at some point some SoC will require non-cacheable buffers for coherency.
> > > >
> > > > I see.
> > > > Would it be better if I implemented xen_alloc_coherent_pages on armv8 by
> > > > calling arm64_swiotlb_dma_ops.alloc?
> > >
> > > What does this buffer do exactly? Is it allocated by guests?
> >
> > It is allocated by Dom0 to do DMA to/from a device.
> > It is the buffer that is going to be returned by dma_map_ops.alloc to
> > the caller:
> >
> > On x86:
> > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> __get_free_pages
> >
> > On ARM:
> > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> arm_dma_ops.alloc
> >
> > On ARM64
> > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> ????
>
> OK, I'm getting more confused. Do all the above calls happen in the
> guest, Dom0, or a mix?
I guess the confusion comes from a difference in terminology: dom0 is a
guest like the others, just a bit more privileged. We usually call domU
a normal unprivileged guest.
The above calls would happen in Dom0 (when an SMMU is not available).
They could also happen in a DomU if we assign a physical device to it
(and an SMMU is not available).
So in general they would happen in any guest that needs to program a
real device.
On Fri, Sep 06, 2013 at 05:09:52PM +0100, Stefano Stabellini wrote:
> On Fri, 6 Sep 2013, Catalin Marinas wrote:
> > On Fri, Sep 06, 2013 at 03:59:02PM +0100, Stefano Stabellini wrote:
> > > On Fri, 6 Sep 2013, Catalin Marinas wrote:
> > > > On Thu, Sep 05, 2013 at 05:43:33PM +0100, Stefano Stabellini wrote:
> > > > > On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > > > > > On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
> > > > > > > xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
> > > > > > > and devices. On native x86 and ARMv8 is sufficient to call
> > > > > > > __get_free_pages in order to get a coherent buffer, while on ARM we need
> > > > > > > to call arm_dma_ops.alloc.
> > > > > >
> > > > > > Don't bet on this for ARMv8. It's not mandated for the architecture, so
> > > > > > at some point some SoC will require non-cacheable buffers for coherency.
> > > > >
> > > > > I see.
> > > > > Would it be better if I implemented xen_alloc_coherent_pages on armv8 by
> > > > > calling arm64_swiotlb_dma_ops.alloc?
> > > >
> > > > What does this buffer do exactly? Is it allocated by guests?
> > >
> > > It is allocated by Dom0 to do DMA to/from a device.
> > > It is the buffer that is going to be returned by dma_map_ops.alloc to
> > > the caller:
> > >
> > > On x86:
> > > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> __get_free_pages
> > >
> > > On ARM:
> > > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> arm_dma_ops.alloc
> > >
> > > On ARM64
> > > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> ????
> >
> > OK, I'm getting more confused. Do all the above calls happen in the
> > guest, Dom0, or a mix?
>
> I guess the confusion comes from a difference in terminology: dom0 is a
> guest like the others, just a bit more privileged. We usually call domU
> a normal unprivileged guest.
Thanks for the explanation.
> The above calls would happen in Dom0 (when an SMMU is not available).
So for Dom0, are there cases when it needs xen_swiotlb_alloc_coherent()
and other cases when it needs the arm_dma_ops.alloc? In Dom0 could we
not always use the default dma_alloc_coherent()?
> They could also happen in a DomU if we assign a physical device to it
> (and an SMMU is not available).
The problem is that you don't necessarily know what kind of coherency you
have for a physical device. As I said, we plan to do this DT-driven.
--
Catalin
On Fri, 6 Sep 2013, Catalin Marinas wrote:
> On Fri, Sep 06, 2013 at 05:09:52PM +0100, Stefano Stabellini wrote:
> > On Fri, 6 Sep 2013, Catalin Marinas wrote:
> > > On Fri, Sep 06, 2013 at 03:59:02PM +0100, Stefano Stabellini wrote:
> > > > On Fri, 6 Sep 2013, Catalin Marinas wrote:
> > > > > On Thu, Sep 05, 2013 at 05:43:33PM +0100, Stefano Stabellini wrote:
> > > > > > On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > > > > > > On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
> > > > > > > > xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
> > > > > > > > and devices. On native x86 and ARMv8 is sufficient to call
> > > > > > > > __get_free_pages in order to get a coherent buffer, while on ARM we need
> > > > > > > > to call arm_dma_ops.alloc.
> > > > > > >
> > > > > > > Don't bet on this for ARMv8. It's not mandated for the architecture, so
> > > > > > > at some point some SoC will require non-cacheable buffers for coherency.
> > > > > >
> > > > > > I see.
> > > > > > Would it be better if I implemented xen_alloc_coherent_pages on armv8 by
> > > > > > calling arm64_swiotlb_dma_ops.alloc?
> > > > >
> > > > > What does this buffer do exactly? Is it allocated by guests?
> > > >
> > > > It is allocated by Dom0 to do DMA to/from a device.
> > > > It is the buffer that is going to be returned by dma_map_ops.alloc to
> > > > the caller:
> > > >
> > > > On x86:
> > > > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> __get_free_pages
> > > >
> > > > On ARM:
> > > > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> arm_dma_ops.alloc
> > > >
> > > > On ARM64
> > > > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> ????
> > >
> > > OK, I'm getting more confused. Do all the above calls happen in the
> > > guest, Dom0, or a mix?
> >
> > I guess the confusion comes from a difference in terminology: dom0 is a
> > guest like the others, just a bit more privileged. We usually call domU
> > a normal unprivileged guest.
>
> Thanks for the explanation.
>
> > The above calls would happen in Dom0 (when an SMMU is not available).
>
> So for Dom0, are there cases when it needs xen_swiotlb_alloc_coherent()
> and other cases when it needs the arm_dma_ops.alloc? In Dom0 could we
> not always use the default dma_alloc_coherent()?
Keep in mind that dom0 runs with second stage translation enabled. This
means that what dom0 thinks is a physical address (a machine address in
Xen terminology) is actually just an intermediate physical address.
For the same reason, what dom0 thinks is a contiguous buffer is
actually only contiguous in the intermediate physical address space.
So every time dom0 wants to allocate a dma-capable buffer it needs to go
via swiotlb-xen, which makes the buffer contiguous in the physical address
space (machine address space in Xen terminology) by issuing a hypercall.
swiotlb-xen also returns the physical address (machine address in Xen
terminology) to the caller.
To answer your question: in the absence of an SMMU, all the
dma_alloc_coherent calls in dom0 need to go via xen_swiotlb_alloc_coherent.
xen_swiotlb_alloc_coherent cannot allocate a contiguous buffer in the physical
address space (see above), but it has to allocate a buffer that is coherent from
the caching-attributes point of view. The hypervisor is going to take
care of making the allocated buffer really contiguous in physical address
space.
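Condensed into a sketch (simplified from the xen_swiotlb_alloc_coherent
changes quoted earlier in the thread; the "already machine-contiguous" fast
path and most error handling are omitted):

#include <linux/dma-mapping.h>
#include <linux/string.h>
#include <xen/xen-ops.h>
#include <asm/xen/page-coherent.h>

/* Simplified flow for a dom0 coherent allocation without an SMMU. */
static void *dom0_coherent_alloc_sketch(struct device *dev, size_t size,
					dma_addr_t *dma_handle, gfp_t flags)
{
	unsigned int order = get_order(size);
	phys_addr_t phys;
	void *vaddr;

	/* 1. Get a CPU-coherent buffer; it is contiguous only in dom0's
	 *    intermediate physical address space. *dma_handle holds the
	 *    (pseudo-)physical address at this point. */
	vaddr = xen_alloc_coherent_pages(dev, size, dma_handle, flags, NULL);
	if (!vaddr)
		return NULL;
	phys = *dma_handle;

	/* 2. Ask Xen to back it with machine-contiguous memory below the
	 *    DMA mask; on success *dma_handle becomes the machine address
	 *    that can be programmed into the device. */
	if (xen_create_contiguous_region(phys, order,
					 fls64(DMA_BIT_MASK(32)), dma_handle)) {
		xen_free_coherent_pages(dev, size, vaddr, phys, NULL);
		return NULL;
	}

	memset(vaddr, 0, size);
	return vaddr;
}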
So now the problem is: how is xen_swiotlb_alloc_coherent going to
allocate a coherent buffer?
On x86 I can just call __get_free_pages.
On ARM I have to call arm_dma_ops.alloc.
On ARM64 ???
BTW, if the Matrix is your kind of fun, I wrote a blog post explaining
swiotlb, Morpheus style:
http://blog.xen.org/index.php/2013/08/14/swiotlb-by-morpheus/
> > They could also happen in a DomU if we assign a physical device to it
> > (and an SMMU is not available).
>
> The problem is that you don't necessarily know one kind of coherency you
> know for a physical device. As I said, we plan to do this DT-driven.
OK, but if I call arm64_swiotlb_dma_ops.alloc passing the right
arguments to it, I should be able to get the right coherency for the
right device, correct?
On 6 Sep 2013, at 17:52, Stefano Stabellini <[email protected]> wrote:
> On Fri, 6 Sep 2013, Catalin Marinas wrote:
>> On Fri, Sep 06, 2013 at 05:09:52PM +0100, Stefano Stabellini wrote:
>>> On Fri, 6 Sep 2013, Catalin Marinas wrote:
>>>> On Fri, Sep 06, 2013 at 03:59:02PM +0100, Stefano Stabellini wrote:
>>>>> On Fri, 6 Sep 2013, Catalin Marinas wrote:
>>>>>> On Thu, Sep 05, 2013 at 05:43:33PM +0100, Stefano Stabellini wrote:
>>>>>>> On Thu, 5 Sep 2013, Catalin Marinas wrote:
>>>>>>>> On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
>>>>>>>>> xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
>>>>>>>>> and devices. On native x86 and ARMv8 is sufficient to call
>>>>>>>>> __get_free_pages in order to get a coherent buffer, while on ARM we need
>>>>>>>>> to call arm_dma_ops.alloc.
>>>>>>>>
>>>>>>>> Don't bet on this for ARMv8. It's not mandated for the architecture, so
>>>>>>>> at some point some SoC will require non-cacheable buffers for coherency.
>>>>>>>
>>>>>>> I see.
>>>>>>> Would it be better if I implemented xen_alloc_coherent_pages on armv8 by
>>>>>>> calling arm64_swiotlb_dma_ops.alloc?
>>>>>>
>>>>>> What does this buffer do exactly? Is it allocated by guests?
>>>>>
>>>>> It is allocated by Dom0 to do DMA to/from a device.
>>>>> It is the buffer that is going to be returned by dma_map_ops.alloc to
>>>>> the caller:
>>>>>
>>>>> On x86:
>>>>> dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> __get_free_pages
>>>>>
>>>>> On ARM:
>>>>> dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> arm_dma_ops.alloc
>>>>>
>>>>> On ARM64
>>>>> dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> ????
>>>>
>>>> OK, I'm getting more confused. Do all the above calls happen in the
>>>> guest, Dom0, or a mix?
>>>
>>> I guess the confusion comes from a difference in terminology: dom0 is a
>>> guest like the others, just a bit more privileged. We usually call domU
>>> a normal unprivileged guest.
>>
>> Thanks for the explanation.
>>
>>> The above calls would happen in Dom0 (when an SMMU is not available).
>>
>> So for Dom0, are there cases when it needs xen_swiotlb_alloc_coherent()
>> and other cases when it needs the arm_dma_ops.alloc? In Dom0 could we
>> not always use the default dma_alloc_coherent()?
>
> Keep in mind that dom0 runs with second stage translation enabled. This
> means that what dom0 thinks is a physical address (machine address in
> Xen terminology), it's actually just an intermediate physical address.
> Also for the same reason what dom0 thinks is a contiguous buffer, it's
> actually only contiguous in the intermediate physical address space.
OK, it makes sense now. I thought dom0 was like the KVM host, where stage
2 is disabled (or just flat).
> BTW if the Matrix is your kind of fun, I wrote an blog post explaining the
> swiotlb Morpheus style:
> http://blog.xen.org/index.php/2013/08/14/swiotlb-by-morpheus/
That was easier to understand ;)
>>> They could also happen in a DomU if we assign a physical device to it
>>> (and an SMMU is not available).
>>
>> The problem is that you don't necessarily know one kind of coherency you
>> know for a physical device. As I said, we plan to do this DT-driven.
>
> OK, but if I call arm64_swiotlb_dma_ops.alloc passing the right
> arguments to it, I should be able to get the right coherency for the
> right device, correct?
I think it needs a bit more work on the Xen part. Basically
dma_alloc_attrs() calls get_dma_ops() to obtain the best DMA operations
for a device. arm64_swiotlb_dma_ops is just the default implementation
and I'll add a _noncoherent variant as well. Default dma_ops will be
set to one of these during boot. But a device is also allowed to have
its own dev->archdata.dma_ops, set via set_dma_ops().
So even if you set the default dma_ops to Xen ops, you may not get them
via dma_alloc_coherent(). I don't see any easier solution other than
patching the dma_alloc_attrs() function to issue a Hyp call after the
memory has been allocated with the get_dma_ops()->alloc(). But I don't
like this either.
Catalin-
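To make the dispatch described above concrete, this is roughly what the
generic allocation path of that era boils down to (a condensed sketch, with
the *_sketch names added here and details trimmed):

#include <linux/dma-mapping.h>
#include <linux/dma-attrs.h>

extern struct dma_map_ops *dma_ops;	/* global default from the series */

/* Per-device archdata.dma_ops, when set, win over the global default,
 * which is why simply installing Xen ops as the default is not enough. */
static inline struct dma_map_ops *get_dma_ops_sketch(struct device *dev)
{
	if (dev && dev->archdata.dma_ops)
		return dev->archdata.dma_ops;	/* per-device override */
	return dma_ops;				/* global default (possibly Xen's) */
}

static inline void *dma_alloc_attrs_sketch(struct device *dev, size_t size,
					   dma_addr_t *dma_handle, gfp_t flags,
					   struct dma_attrs *attrs)
{
	struct dma_map_ops *ops = get_dma_ops_sketch(dev);

	/* If ops came from the per-device field, the Xen hypercall that
	 * makes the buffer machine-contiguous is never issued. */
	return ops->alloc(dev, size, dma_handle, flags, attrs);
}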
On Mon, 9 Sep 2013, Catalin Marinas wrote:
> >>> They could also happen in a DomU if we assign a physical device to it
> >>> (and an SMMU is not available).
> >>
> >> The problem is that you don't necessarily know one kind of coherency you
> >> know for a physical device. As I said, we plan to do this DT-driven.
> >
> > OK, but if I call arm64_swiotlb_dma_ops.alloc passing the right
> > arguments to it, I should be able to get the right coherency for the
> > right device, correct?
>
> I think it needs a bit more work on the Xen part. Basically
> dma_alloc_attrs() calls get_dma_ops() to obtain the best DMA operations
> for a device. arm64_swiotlb_dma_ops is just the default implementation
> and I'll add a _noncoherent variant as well. Default dma_ops will be
> set to one of these during boot. But a device is also allowed to have
> its own dev->archdata.dma_ops, set via set_dma_ops().
>
> So even if you set the default dma_ops to Xen ops, you may not get them
> via dma_alloc_coherent(). I don't see any easier solution other than
> patching the dma_alloc_attrs() function to issue a Hyp call after the
> memory has been allocated with the get_dma_ops()->alloc(). But I don't
> like this either.
I see. This problem affects arch/arm as well.
Either we add an if (!xen_domain()) in get_dma_ops, or we could make
get_dma_ops a function pointer and let people override it; a rough sketch of
that second option follows after the diff.
See below the first option implemented for arch/arm on top of the
swiotlb series:
diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index 7d6e4f9..0b8b5e4 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -12,6 +12,8 @@
#include <asm/memory.h>
#include <asm/cacheflush.h>
+#include <xen/xen.h>
+
#define DMA_ERROR_CODE (~0)
extern struct dma_map_ops *dma_ops;
extern struct dma_map_ops arm_dma_ops;
@@ -19,7 +21,7 @@ extern struct dma_map_ops arm_coherent_dma_ops;
static inline struct dma_map_ops *get_dma_ops(struct device *dev)
{
- if (dev && dev->archdata.dma_ops)
+ if (!xen_domain() && dev && dev->archdata.dma_ops)
return dev->archdata.dma_ops;
return dma_ops;
}
diff --git a/arch/arm/include/asm/xen/page-coherent.h b/arch/arm/include/asm/xen/page-coherent.h
index af2cf8d..c2232fe 100644
--- a/arch/arm/include/asm/xen/page-coherent.h
+++ b/arch/arm/include/asm/xen/page-coherent.h
@@ -9,6 +9,8 @@ static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
dma_addr_t *dma_handle, gfp_t flags,
struct dma_attrs *attrs)
{
+ if (hwdev && hwdev->archdata.dma_ops)
+ return hwdev->archdata.dma_ops->alloc(hwdev, size, dma_handle, flags, attrs);
return arm_dma_ops.alloc(hwdev, size, dma_handle, flags, attrs);
}
@@ -16,6 +18,8 @@ static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
void *cpu_addr, dma_addr_t dma_handle,
struct dma_attrs *attrs)
{
+ if (hwdev && hwdev->archdata.dma_ops)
+ return hwdev->archdata.dma_ops->free(hwdev, size, cpu_addr, dma_handle, attrs);
return arm_dma_ops.free(hwdev, size, cpu_addr, dma_handle, attrs);
}
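For completeness, a rough sketch of the second option; the
arm_get_dma_ops/xen_setup_dma_ops names are made up and not part of the
series:

/* arch/arm/include/asm/dma-mapping.h */
extern struct dma_map_ops *(*arm_get_dma_ops)(struct device *dev);

static inline struct dma_map_ops *get_dma_ops(struct device *dev)
{
        return arm_get_dma_ops(dev);
}

/* Xen init code overrides the hook once it knows it runs on Xen */
static struct dma_map_ops *xen_get_dma_ops(struct device *dev)
{
        return xen_dma_ops;             /* always go through swiotlb-xen */
}

static void __init xen_setup_dma_ops(void)
{
        arm_get_dma_ops = xen_get_dma_ops;
}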
On Mon, Sep 09, 2013 at 05:46:59PM +0100, Stefano Stabellini wrote:
> On Mon, 9 Sep 2013, Catalin Marinas wrote:
> > >>> They could also happen in a DomU if we assign a physical device to it
> > >>> (and an SMMU is not available).
> > >>
> > >> The problem is that you don't necessarily know what kind of coherency you
> > >> have for a physical device. As I said, we plan to do this DT-driven.
> > >
> > > OK, but if I call arm64_swiotlb_dma_ops.alloc passing the right
> > > arguments to it, I should be able to get the right coherency for the
> > > right device, correct?
> >
> > I think it needs a bit more work on the Xen part. Basically
> > dma_alloc_attrs() calls get_dma_ops() to obtain the best DMA operations
> > for a device. arm64_swiotlb_dma_ops is just the default implementation
> > and I'll add a _noncoherent variant as well. Default dma_ops will be
> > set to one of these during boot. But a device is also allowed to have
> > its own dev->archdata.dma_ops, set via set_dma_ops().
> >
> > So even if you set the default dma_ops to Xen ops, you may not get them
> > via dma_alloc_coherent(). I don't see any easier solution other than
> > patching the dma_alloc_attrs() function to issue a Hyp call after the
> > memory has been allocated with the get_dma_ops()->alloc(). But I don't
> > like this either.
>
> I see. This problem affects arch/arm as well.
> Either we add an if (!xen_domain()) in get_dma_ops, or we could make
> get_dma_ops a function pointer and let people overwrite it.
> See below the first option implemented for arch/arm on top of the
> swiotlb series:
>
>
> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> index 7d6e4f9..0b8b5e4 100644
> --- a/arch/arm/include/asm/dma-mapping.h
> +++ b/arch/arm/include/asm/dma-mapping.h
> @@ -12,6 +12,8 @@
> #include <asm/memory.h>
> #include <asm/cacheflush.h>
>
> +#include <xen/xen.h>
> +
> #define DMA_ERROR_CODE (~0)
> extern struct dma_map_ops *dma_ops;
> extern struct dma_map_ops arm_dma_ops;
> @@ -19,7 +21,7 @@ extern struct dma_map_ops arm_coherent_dma_ops;
>
> static inline struct dma_map_ops *get_dma_ops(struct device *dev)
> {
> - if (dev && dev->archdata.dma_ops)
> + if (!xen_domain() && dev && dev->archdata.dma_ops)
> return dev->archdata.dma_ops;
> return dma_ops;
> }
> diff --git a/arch/arm/include/asm/xen/page-coherent.h b/arch/arm/include/asm/xen/page-coherent.h
> index af2cf8d..c2232fe 100644
> --- a/arch/arm/include/asm/xen/page-coherent.h
> +++ b/arch/arm/include/asm/xen/page-coherent.h
> @@ -9,6 +9,8 @@ static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
> dma_addr_t *dma_handle, gfp_t flags,
> struct dma_attrs *attrs)
> {
> + if (hwdev && hwdev->archdata.dma_ops)
> + return hwdev->archdata.dma_ops->alloc(hwdev, size, dma_handle, flags, attrs);
> return arm_dma_ops.alloc(hwdev, size, dma_handle, flags, attrs);
I still don't like xen digging into the archdata internals. What about:
diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index 5b579b9..5fa472c 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -15,13 +15,20 @@
extern struct dma_map_ops arm_dma_ops;
extern struct dma_map_ops arm_coherent_dma_ops;
-static inline struct dma_map_ops *get_dma_ops(struct device *dev)
+static inline struct dma_map_ops *__get_dma_ops(struct device *dev)
{
if (dev && dev->archdata.dma_ops)
return dev->archdata.dma_ops;
return &arm_dma_ops;
}
+static inline struct dma_map_ops *get_dma_ops(struct device *dev)
+{
+ if (xen_domain())
+ return xen_dma_ops;
+ return __get_dma_ops(dev);
+}
+
static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
{
BUG_ON(!dev);
@@ -32,7 +39,7 @@ static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
static inline int dma_set_mask(struct device *dev, u64 mask)
{
- return get_dma_ops(dev)->set_dma_mask(dev, mask);
+ return __get_dma_ops(dev)->set_dma_mask(dev, mask);
}
#ifdef __arch_page_to_dma
And in xen_alloc_coherent_pages():
return __get_dma_ops(dev)->alloc(...);
Alternatively, add the xen_domain() check in dma_alloc_attrs() instead
of get_dma_ops() (and other functions like map_sg etc.)
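Roughly (a sketch only; the real dma_alloc_attrs() has more to it than
this):

static inline void *dma_alloc_attrs(struct device *dev, size_t size,
                                    dma_addr_t *dma_handle, gfp_t flag,
                                    struct dma_attrs *attrs)
{
        struct dma_map_ops *ops;

        if (xen_domain())
                ops = xen_dma_ops;      /* force the swiotlb-xen path */
        else
                ops = get_dma_ops(dev);

        return ops->alloc(dev, size, dma_handle, flag, attrs);
}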
--
Catalin
On Wed, 11 Sep 2013, Catalin Marinas wrote:
> On Mon, Sep 09, 2013 at 05:46:59PM +0100, Stefano Stabellini wrote:
> > On Mon, 9 Sep 2013, Catalin Marinas wrote:
> > > >>> They could also happen in a DomU if we assign a physical device to it
> > > >>> (and an SMMU is not available).
> > > >>
> > > >> The problem is that you don't necessarily know what kind of coherency you
> > > >> have for a physical device. As I said, we plan to do this DT-driven.
> > > >
> > > > OK, but if I call arm64_swiotlb_dma_ops.alloc passing the right
> > > > arguments to it, I should be able to get the right coherency for the
> > > > right device, correct?
> > >
> > > I think it needs a bit more work on the Xen part. Basically
> > > dma_alloc_attrs() calls get_dma_ops() to obtain the best DMA operations
> > > for a device. arm64_swiotlb_dma_ops is just the default implementation
> > > and I'll add a _noncoherent variant as well. Default dma_ops will be
> > > set to one of these during boot. But a device is also allowed to have
> > > its own dev->archdata.dma_ops, set via set_dma_ops().
> > >
> > > So even if you set the default dma_ops to Xen ops, you may not get them
> > > via dma_alloc_coherent(). I don't see any easier solution other than
> > > patching the dma_alloc_attrs() function to issue a Hyp call after the
> > > memory has been allocated with the get_dma_ops()->alloc(). But I don't
> > > like this either.
> >
> > I see. This problem affects arch/arm as well.
> > Either we add an if (!xen_domain()) in get_dma_ops, or we could make
> > get_dma_ops a function pointer and let people overwrite it.
> > See below the first option implemented for arch/arm on top of the
> > swiotlb series:
> >
> >
> > diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> > index 7d6e4f9..0b8b5e4 100644
> > --- a/arch/arm/include/asm/dma-mapping.h
> > +++ b/arch/arm/include/asm/dma-mapping.h
> > @@ -12,6 +12,8 @@
> > #include <asm/memory.h>
> > #include <asm/cacheflush.h>
> >
> > +#include <xen/xen.h>
> > +
> > #define DMA_ERROR_CODE (~0)
> > extern struct dma_map_ops *dma_ops;
> > extern struct dma_map_ops arm_dma_ops;
> > @@ -19,7 +21,7 @@ extern struct dma_map_ops arm_coherent_dma_ops;
> >
> > static inline struct dma_map_ops *get_dma_ops(struct device *dev)
> > {
> > - if (dev && dev->archdata.dma_ops)
> > + if (!xen_domain() && dev && dev->archdata.dma_ops)
> > return dev->archdata.dma_ops;
> > return dma_ops;
> > }
> > diff --git a/arch/arm/include/asm/xen/page-coherent.h b/arch/arm/include/asm/xen/page-coherent.h
> > index af2cf8d..c2232fe 100644
> > --- a/arch/arm/include/asm/xen/page-coherent.h
> > +++ b/arch/arm/include/asm/xen/page-coherent.h
> > @@ -9,6 +9,8 @@ static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
> > dma_addr_t *dma_handle, gfp_t flags,
> > struct dma_attrs *attrs)
> > {
> > + if (hwdev && hwdev->archdata.dma_ops)
> > + return hwdev->archdata.dma_ops->alloc(hwdev, size, dma_handle, flags, attrs);
> > return arm_dma_ops.alloc(hwdev, size, dma_handle, flags, attrs);
>
> I still don't like xen digging into the archdata internals.
Me neither.
> What about:
>
> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> index 5b579b9..5fa472c 100644
> --- a/arch/arm/include/asm/dma-mapping.h
> +++ b/arch/arm/include/asm/dma-mapping.h
> @@ -15,13 +15,20 @@
> extern struct dma_map_ops arm_dma_ops;
> extern struct dma_map_ops arm_coherent_dma_ops;
>
> -static inline struct dma_map_ops *get_dma_ops(struct device *dev)
> +static inline struct dma_map_ops *__get_dma_ops(struct device *dev)
> {
> if (dev && dev->archdata.dma_ops)
> return dev->archdata.dma_ops;
> return &arm_dma_ops;
> }
>
> +static inline struct dma_map_ops *get_dma_ops(struct device *dev)
> +{
> + if (xen_domain())
> + return xen_dma_ops;
> + return __get_dma_ops(dev);
> +}
I agree that this is better.
> static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
> {
> BUG_ON(!dev);
> @@ -32,7 +39,7 @@ static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
>
> static inline int dma_set_mask(struct device *dev, u64 mask)
> {
> - return get_dma_ops(dev)->set_dma_mask(dev, mask);
> + return __get_dma_ops(dev)->set_dma_mask(dev, mask);
> }
>
> #ifdef __arch_page_to_dma
I don't understand the reason for this change though: shouldn't
set_dma_mask go via the "default" (whatever that is), like the others?
On native it won't make a difference: either way it ends up calling
arm_set_dma_mask.
On Xen it would make a difference, because
get_dma_ops(dev)->set_dma_mask would end up calling
xen_swiotlb_set_dma_mask, which checks whether the mask is supported by
the swiotlb buffer before setting it for the device, while
arm_set_dma_mask obviously doesn't do that.
> And in xen_alloc_coherent_pages():
>
> return __get_dma_ops(dev)->alloc(...);
Right.
> Alternatively, add the xen_domain() check in dma_alloc_attrs() instead
> of get_dma_ops() (and other functions like map_sg etc.)
I prefer this approach because it is going to be more concise.
On Wed, Sep 11, 2013 at 06:34:17PM +0100, Stefano Stabellini wrote:
> On Wed, 11 Sep 2013, Catalin Marinas wrote:
> > static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
> > {
> > BUG_ON(!dev);
> > @@ -32,7 +39,7 @@ static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
> >
> > static inline int dma_set_mask(struct device *dev, u64 mask)
> > {
> > - return get_dma_ops(dev)->set_dma_mask(dev, mask);
> > + return __get_dma_ops(dev)->set_dma_mask(dev, mask);
> > }
> >
> > #ifdef __arch_page_to_dma
>
> I don't understand the reason for this change though: shouldn't
> set_dma_mask go via the "default" (whatever that is), like the others?
Under Xen, this would mean xen_dma_ops->set_dma_mask(). I thought you
wanted to set the default dma mask.
I can see why you may want set_dma_mask() for xen_dma_ops, but it also
needs to be passed to __get_dma_ops(dev)->set_dma_mask(), otherwise the
actual dma_alloc_attrs() will miss the mask (and on arm64 it does
bounce buffering via swiotlb).
--
Catalin
On Thu, 12 Sep 2013, Catalin Marinas wrote:
> On Wed, Sep 11, 2013 at 06:34:17PM +0100, Stefano Stabellini wrote:
> > On Wed, 11 Sep 2013, Catalin Marinas wrote:
> > > static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
> > > {
> > > BUG_ON(!dev);
> > > @@ -32,7 +39,7 @@ static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
> > >
> > > static inline int dma_set_mask(struct device *dev, u64 mask)
> > > {
> > > - return get_dma_ops(dev)->set_dma_mask(dev, mask);
> > > + return __get_dma_ops(dev)->set_dma_mask(dev, mask);
> > > }
> > >
> > > #ifdef __arch_page_to_dma
> >
> > I don't understand the reason for this change though: shouldn't
> > set_dma_mask go via the "default" (whatever that is), like the others?
>
> Under Xen, this would mean xen_dma_ops->set_dma_mask(). I thought you
> wanted to set the default dma mask.
>
> I can see why you may want set_dma_mask() for xen_dma_ops, but it also
> needs to be passed to __get_dma_ops(dev)->set_dma_mask(), otherwise the
> actual dma_alloc_attrs() will miss the mask (and on arm64 it does
> bounce buffering via swiotlb).
xen_dma_ops->set_dma_mask() is implemented by
drivers/xen/swiotlb-xen.c:xen_swiotlb_set_dma_mask (provided by this
patch series). The function takes care of setting:
*dev->dma_mask = dma_mask;
I think that should be enough to make dma_alloc_attrs happy.
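Something along these lines (simplified sketch, not the exact code in the
patch):

int xen_swiotlb_set_dma_mask(struct device *dev, u64 dma_mask)
{
        /* reject masks the swiotlb bounce buffer cannot satisfy */
        if (!dev->dma_mask || !xen_swiotlb_dma_supported(dev, dma_mask))
                return -EIO;

        *dev->dma_mask = dma_mask;

        return 0;
}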
On Thu, Sep 12, 2013 at 03:44:50PM +0100, Stefano Stabellini wrote:
> On Thu, 12 Sep 2013, Catalin Marinas wrote:
> > On Wed, Sep 11, 2013 at 06:34:17PM +0100, Stefano Stabellini wrote:
> > > On Wed, 11 Sep 2013, Catalin Marinas wrote:
> > > > static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
> > > > {
> > > > BUG_ON(!dev);
> > > > @@ -32,7 +39,7 @@ static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
> > > >
> > > > static inline int dma_set_mask(struct device *dev, u64 mask)
> > > > {
> > > > - return get_dma_ops(dev)->set_dma_mask(dev, mask);
> > > > + return __get_dma_ops(dev)->set_dma_mask(dev, mask);
> > > > }
> > > >
> > > > #ifdef __arch_page_to_dma
> > >
> > > I don't understand the reason for this change though: shouldn't
> > > set_dma_mask go via the "default" (whatever that is), like the others?
> >
> > Under Xen, this would mean xen_dma_ops->set_dma_mask(). I thought you
> > wanted to set the default dma mask.
> >
> > I can see why you may want set_dma_mask() for xen_dma_ops, but it also
> > needs to be passed to __get_dma_ops(dev)->set_dma_mask(), otherwise the
> > actual dma_alloc_attrs() will miss the mask (and on arm64 it does
> > bounce buffering via swiotlb).
>
> xen_dma_ops->set_dma_mask() is implemented by
> drivers/xen/swiotlb-xen.c:xen_swiotlb_set_dma_mask (provided by this
> patch series). The function takes care of setting:
>
> *dev->dma_mask = dma_mask;
>
> I think that should be enough to make dma_alloc_attrs happy.
I think it would be safer to call __get_dma_ops()->set_dma_mask() in
case the driver overrides it and does more than setting dev->dma_mask.
I don't know whether that is the case though.
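A hypothetical sketch of that safer variant, assuming the __get_dma_ops()
split suggested earlier in the thread:

int xen_swiotlb_set_dma_mask(struct device *dev, u64 dma_mask)
{
        if (!dev->dma_mask || !xen_swiotlb_dma_supported(dev, dma_mask))
                return -EIO;

        /* give any per-device/native set_dma_mask() a chance to run too */
        if (__get_dma_ops(dev)->set_dma_mask)
                return __get_dma_ops(dev)->set_dma_mask(dev, dma_mask);

        *dev->dma_mask = dma_mask;
        return 0;
}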
--
Catalin