2009-03-13 17:00:53

by Jeremy Fitzhardinge

Subject: [GIT PULL] Xen dom0 hardware access support


This series of patches enables the dom0 guest to access and control hardware. The changes fall into three groups:

1. DMA and swiotlb updates
- Implement Xen versions of all the DMA API mapping ops
- Put Xen-specific bits into the swiotlb hooks where needed, and define a
Xen+swiotlb set of ops (a rough sketch of the resulting wiring follows
this list)

2. DRM updates
- make sure _PAGE_IOMAP is set on VM_IO mappings, as created by remap_pfn_range
- use swiotlb_bus_to_phys/phys_to_bus to implement phys_to_gart/gart_to_phys
(despite the swiotlb_* name, the functions are fairly generic, at least on x86)
- Use dma_alloc_coherent for alloc_gatt_pages, to make sure they're physically
contiguous
- Likewise, use dma_alloc_coherent for the special i8xx ARGB cursor memory

3. MTRR improvements (to make /proc/mtrr fully functional)
Complete the MTRR implementation introduced in the xen/dom0/core series
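
As a rough, hypothetical sketch of how group 1 hangs together (the field
names follow this era's x86 struct dma_mapping_ops; the actual xen_*
implementations are in the patches below):

static struct dma_mapping_ops xen_dma_ops = {
	.alloc_coherent	= xen_alloc_coherent,
	.free_coherent	= xen_free_coherent,
	.map_page	= xen_map_page,
	.unmap_page	= xen_unmap_page,
	/* map_sg, unmap_sg and the sync_* hooks are filled in likewise */
};

/* installed by xen_iommu_init() when running under Xen */
dma_ops = &xen_dma_ops;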

The following changes since commit 089faa06184f85284ba6c4164dd7b4741ca5d5c5:
Jeremy Fitzhardinge (1):
x86: don't need "changed" parameter for set_io_bitmap()

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git push/xen/dom0/hardware

Alex Nixon (7):
xen: Don't disable the I/O space
xen: Allow unprivileged Xen domains to create iomap pages
Xen: Rename the balloon lock
xen: Add xen_create_contiguous_region
x86/PCI: Clean up pci_cache_line_size
x86/PCI: Enable scanning of all pci functions
Xen/x86/PCI: Add support for the Xen PCI subsystem

Ian Campbell (4):
xen swiotlb: fixup swiotlb in chunks smaller than MAX_CONTIG_ORDER
xen: add hooks for mapping phys<->bus addresses in swiotlb
xen/swiotlb: add hook for swiotlb_arch_range_needs_mapping
xen: enable swiotlb for xen domain 0.

Jeremy Fitzhardinge (12):
x86/pci: make sure _PAGE_IOMAP is set on pci mappings
xen/pci: clean up Kconfig a bit
xen: make sure swiotlb allocation is physically contiguous
xen/swiotlb: use dma_alloc_from_coherent to get device coherent memory
swiotlb: use swiotlb_alloc_boot to allocate emergency pool
xen/swiotlb: improve comment on gfp flags in xen_alloc_coherent()
xen/swiotlb: add sync functions
xen: set callout_map to make mtrr work
x86: define arch_vm_get_page_prot to set _PAGE_IOMAP on VM_IO vmas
agp: use more dma-ops-like operations for agp memory
agp/intel: use dma_alloc_coherent for special cursor memory
Merge branches 'push/xen/dom0/drm', 'push/xen/dom0/mtrr' and 'push/xen/dom0/pci' into push/xen/dom0/hardware

Mark McLoughlin (5):
xen mtrr: Use specific cpu_has_foo macros instead of generic cpu_has()
xen mtrr: Use generic_validate_add_page()
xen mtrr: Implement xen_get_free_region()
xen mtrr: Add xen_{get,set}_mtrr() implementations
xen mtrr: Kill some unnecessary includes

arch/x86/Kconfig | 4 +
arch/x86/include/asm/agp.h | 15 ++-
arch/x86/include/asm/pci.h | 8 +-
arch/x86/include/asm/pci_x86.h | 2 +
arch/x86/include/asm/pgtable.h | 3 +
arch/x86/include/asm/xen/iommu.h | 12 ++
arch/x86/kernel/cpu/mtrr/mtrr.h | 2 +
arch/x86/kernel/cpu/mtrr/xen.c | 101 +++++++++---
arch/x86/kernel/pci-dma.c | 3 +
arch/x86/kernel/pci-swiotlb.c | 28 +++-
arch/x86/mm/pgtable.c | 10 ++
arch/x86/pci/Makefile | 1 +
arch/x86/pci/common.c | 18 ++-
arch/x86/pci/i386.c | 3 +
arch/x86/pci/init.c | 6 +
arch/x86/pci/xen.c | 52 ++++++
arch/x86/xen/Kconfig | 3 +
arch/x86/xen/enlighten.c | 6 +-
arch/x86/xen/mmu.c | 225 +++++++++++++++++++++++++-
arch/x86/xen/setup.c | 3 -
arch/x86/xen/smp.c | 1 +
drivers/char/agp/intel-agp.c | 26 ++--
drivers/pci/Makefile | 2 +
drivers/pci/xen-iommu.c | 331 ++++++++++++++++++++++++++++++++++++++
drivers/xen/balloon.c | 15 +--
include/xen/interface/memory.h | 50 ++++++
include/xen/swiotlb.h | 19 +++
include/xen/xen-ops.h | 6 +
lib/swiotlb.c | 5 +-
29 files changed, 890 insertions(+), 70 deletions(-)
create mode 100644 arch/x86/include/asm/xen/iommu.h
create mode 100644 arch/x86/pci/xen.c
create mode 100644 drivers/pci/xen-iommu.c
create mode 100644 include/xen/swiotlb.h


2009-03-13 17:00:39

by Jeremy Fitzhardinge

Subject: [PATCH 01/27] xen: Don't disable the I/O space

From: Alex Nixon <[email protected]>

If a guest domain wants to access PCI devices through the frontend
driver (coming later in the patch series), it will need access to the
I/O space.

Signed-off-by: Alex Nixon <[email protected]>
Signed-off-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/xen/setup.c | 3 ---
1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 175396c..5f1b88c 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -230,8 +230,5 @@ void __init xen_arch_setup(void)

pm_idle = xen_idle;

- if (!xen_initial_domain())
- paravirt_disable_iospace();
-
fiddle_vdso();
}
--
1.6.0.6

2009-03-13 17:01:39

by Jeremy Fitzhardinge

Subject: [PATCH 02/27] xen: Allow unprivileged Xen domains to create iomap pages

From: Alex Nixon <[email protected]>

PV DomU domains are allowed to map hardware MFNs for PCI passthrough,
but are not generally allowed to map raw machine pages. In particular,
various pieces of code try to map DMI and ACPI tables in the ISA ROM
range. We disallow _PAGE_IOMAP for those mappings, so that they are
redirected to a set of local zeroed pages we reserve for that purpose.

Signed-off-by: Alex Nixon <[email protected]>
Signed-off-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/xen/enlighten.c | 6 +++---
arch/x86/xen/mmu.c | 18 +++++++++++++++---
2 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index f673cd8..0605f19 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -998,11 +998,11 @@ asmlinkage void __init xen_start_kernel(void)

/* Prevent unwanted bits from being set in PTEs. */
__supported_pte_mask &= ~_PAGE_GLOBAL;
- if (xen_initial_domain())
- __supported_pte_mask |= _PAGE_IOMAP;
- else
+ if (!xen_initial_domain())
__supported_pte_mask &= ~(_PAGE_PWT | _PAGE_PCD);

+ __supported_pte_mask |= _PAGE_IOMAP;
+
/* Don't do the full vcpu_info placement stuff until we have a
possible map and a non-dummy shared_info. */
per_cpu(xen_vcpu, 0) = &HYPERVISOR_shared_info->vcpu_info[0];
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 0a0cd61..58af0b0 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -49,6 +49,7 @@
#include <asm/mmu_context.h>
#include <asm/setup.h>
#include <asm/paravirt.h>
+#include <asm/e820.h>
#include <asm/linkage.h>

#include <asm/xen/hypercall.h>
@@ -398,7 +399,7 @@ static bool xen_page_pinned(void *ptr)

static bool xen_iomap_pte(pte_t pte)
{
- return xen_initial_domain() && (pte_flags(pte) & _PAGE_IOMAP);
+ return pte_flags(pte) & _PAGE_IOMAP;
}

static void xen_set_iomap_pte(pte_t *ptep, pte_t pteval)
@@ -600,10 +601,21 @@ PV_CALLEE_SAVE_REGS_THUNK(xen_pgd_val);

pte_t xen_make_pte(pteval_t pte)
{
- if (unlikely(xen_initial_domain() && (pte & _PAGE_IOMAP)))
+ phys_addr_t addr = (pte & PTE_PFN_MASK);
+
+ /*
+ * Unprivileged domains are allowed to do IOMAPpings for
+ * PCI passthrough, but not map ISA space. The ISA
+ * mappings are just dummy local mappings to keep other
+ * parts of the kernel happy.
+ */
+ if (unlikely(pte & _PAGE_IOMAP) &&
+ (xen_initial_domain() || addr >= ISA_END_ADDRESS)) {
pte = iomap_pte(pte);
- else
+ } else {
+ pte &= ~_PAGE_IOMAP;
pte = pte_pfn_to_mfn(pte);
+ }

return native_make_pte(pte);
}
--
1.6.0.6

2009-03-13 17:01:19

by Jeremy Fitzhardinge

Subject: [PATCH 04/27] xen: Add xen_create_contiguous_region

From: Alex Nixon <[email protected]>

A memory region must be physically contiguous in order to be accessed
through DMA. This patch adds xen_create_contiguous_region, which
ensures a region of contiguous virtual memory is also physically
contiguous.

Based on Stephen Tweedie's port of the 2.6.18-xen version.

Remove contiguous_bitmap[] as it's no longer needed.

Ported from linux-2.6.18-xen.hg 707:e410857fd83c
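
As a usage sketch (illustrative only - the order and address_bits values
here are made up, not taken from the patch):

/* make an order-4 (64KB) kernel allocation machine-contiguous
 * and addressable with 32 bits, then release it again */
unsigned long vstart = __get_free_pages(GFP_KERNEL, 4);

if (vstart && xen_create_contiguous_region(vstart, 4, 32) == 0) {
	/* the underlying mfns are now contiguous: safe to DMA to/from */
	xen_destroy_contiguous_region(vstart, 4);
}
free_pages(vstart, 4);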

Signed-off-by: Alex Nixon <[email protected]>
Signed-off-by: Jeremy Fitzhardinge <[email protected]>
Signed-off-by: Ian Campbell <[email protected]>
---
arch/x86/xen/mmu.c | 200 ++++++++++++++++++++++++++++++++++++++++
include/xen/interface/memory.h | 42 +++++++++
include/xen/xen-ops.h | 6 +
3 files changed, 248 insertions(+), 0 deletions(-)

diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index c2aa44b..b3eb2eb 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -51,6 +51,7 @@
#include <asm/paravirt.h>
#include <asm/e820.h>
#include <asm/linkage.h>
+#include <asm/page.h>

#include <asm/xen/hypercall.h>
#include <asm/xen/hypervisor.h>
@@ -2097,6 +2098,205 @@ const struct pv_mmu_ops xen_mmu_ops __initdata = {
};


+/* Protected by xen_reservation_lock. */
+#define MAX_CONTIG_ORDER 9 /* 2MB */
+static unsigned long discontig_frames[1<<MAX_CONTIG_ORDER];
+
+#define VOID_PTE (mfn_pte(0, __pgprot(0)))
+static void xen_zap_pfn_range(unsigned long vaddr, unsigned int order,
+ unsigned long *in_frames,
+ unsigned long *out_frames)
+{
+ int i;
+ struct multicall_space mcs;
+
+ xen_mc_batch();
+ for (i = 0; i < (1UL<<order); i++, vaddr += PAGE_SIZE) {
+ mcs = __xen_mc_entry(0);
+
+ if (in_frames)
+ in_frames[i] = virt_to_mfn(vaddr);
+
+ MULTI_update_va_mapping(mcs.mc, vaddr, VOID_PTE, 0);
+ set_phys_to_machine(virt_to_pfn(vaddr), INVALID_P2M_ENTRY);
+
+ if (out_frames)
+ out_frames[i] = virt_to_pfn(vaddr);
+ }
+ xen_mc_issue(0);
+}
+
+/*
+ * Update the pfn-to-mfn mappings for a virtual address range, either to
+ * point to an array of mfns, or contiguously from a single starting
+ * mfn.
+ */
+static void xen_remap_exchanged_ptes(unsigned long vaddr, int order,
+ unsigned long *mfns,
+ unsigned long first_mfn)
+{
+ unsigned i, limit;
+ unsigned long mfn;
+
+ xen_mc_batch();
+
+ limit = 1u << order;
+ for (i = 0; i < limit; i++, vaddr += PAGE_SIZE) {
+ struct multicall_space mcs;
+ unsigned flags;
+
+ mcs = __xen_mc_entry(0);
+ if (mfns)
+ mfn = mfns[i];
+ else
+ mfn = first_mfn + i;
+
+ if (i < (limit - 1))
+ flags = 0;
+ else {
+ if (order == 0)
+ flags = UVMF_INVLPG | UVMF_ALL;
+ else
+ flags = UVMF_TLB_FLUSH | UVMF_ALL;
+ }
+
+ MULTI_update_va_mapping(mcs.mc, vaddr,
+ mfn_pte(mfn, PAGE_KERNEL), flags);
+
+ set_phys_to_machine(virt_to_pfn(vaddr), mfn);
+ }
+
+ xen_mc_issue(0);
+}
+
+/*
+ * Perform the hypercall to exchange a region of our pfns to point to
+ * memory with the required contiguous alignment. Takes the pfns as
+ * input, and populates mfns as output.
+ *
+ * Returns a success code indicating whether the hypervisor was able to
+ * satisfy the request or not.
+ */
+static int xen_exchange_memory(unsigned long extents_in, unsigned int order_in,
+ unsigned long *pfns_in,
+ unsigned long extents_out, unsigned int order_out,
+ unsigned long *mfns_out,
+ unsigned int address_bits)
+{
+ long rc;
+ int success;
+
+ struct xen_memory_exchange exchange = {
+ .in = {
+ .nr_extents = extents_in,
+ .extent_order = order_in,
+ .extent_start = pfns_in,
+ .domid = DOMID_SELF
+ },
+ .out = {
+ .nr_extents = extents_out,
+ .extent_order = order_out,
+ .extent_start = mfns_out,
+ .address_bits = address_bits,
+ .domid = DOMID_SELF
+ }
+ };
+
+ BUG_ON(extents_in << order_in != extents_out << order_out);
+
+ rc = HYPERVISOR_memory_op(XENMEM_exchange, &exchange);
+ success = (exchange.nr_exchanged == extents_in);
+
+ BUG_ON(!success && ((exchange.nr_exchanged != 0) || (rc == 0)));
+ BUG_ON(success && (rc != 0));
+
+ return success;
+}
+
+int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
+ unsigned int address_bits)
+{
+ unsigned long *in_frames = discontig_frames, out_frame;
+ unsigned long flags;
+ int success;
+
+ /*
+ * Currently an auto-translated guest will not perform I/O, nor will
+ * it require PAE page directories below 4GB. Therefore any calls to
+ * this function are redundant and can be ignored.
+ */
+
+ if (xen_feature(XENFEAT_auto_translated_physmap))
+ return 0;
+
+ if (unlikely(order > MAX_CONTIG_ORDER))
+ return -ENOMEM;
+
+ memset((void *) vstart, 0, PAGE_SIZE << order);
+
+ vm_unmap_aliases();
+
+ spin_lock_irqsave(&xen_reservation_lock, flags);
+
+ /* 1. Zap current PTEs, remembering MFNs. */
+ xen_zap_pfn_range(vstart, order, in_frames, NULL);
+
+ /* 2. Get a new contiguous memory extent. */
+ out_frame = virt_to_pfn(vstart);
+ success = xen_exchange_memory(1UL << order, 0, in_frames,
+ 1, order, &out_frame,
+ address_bits);
+
+ /* 3. Map the new extent in place of old pages. */
+ if (success)
+ xen_remap_exchanged_ptes(vstart, order, NULL, out_frame);
+ else
+ xen_remap_exchanged_ptes(vstart, order, in_frames, 0);
+
+ spin_unlock_irqrestore(&xen_reservation_lock, flags);
+
+ return success ? 0 : -ENOMEM;
+}
+EXPORT_SYMBOL_GPL(xen_create_contiguous_region);
+
+void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order)
+{
+ unsigned long *out_frames = discontig_frames, in_frame;
+ unsigned long flags;
+ int success;
+
+ if (xen_feature(XENFEAT_auto_translated_physmap))
+ return;
+
+ if (unlikely(order > MAX_CONTIG_ORDER))
+ return;
+
+ memset((void *) vstart, 0, PAGE_SIZE << order);
+
+ vm_unmap_aliases();
+
+ spin_lock_irqsave(&xen_reservation_lock, flags);
+
+ /* 1. Find start MFN of contiguous extent. */
+ in_frame = virt_to_mfn(vstart);
+
+ /* 2. Zap current PTEs. */
+ xen_zap_pfn_range(vstart, order, NULL, out_frames);
+
+ /* 3. Do the exchange for non-contiguous MFNs. */
+ success = xen_exchange_memory(1, order, &in_frame, 1UL << order,
+ 0, out_frames, 0);
+
+ /* 4. Map new pages in place of old pages. */
+ if (success)
+ xen_remap_exchanged_ptes(vstart, order, out_frames, 0);
+ else
+ xen_remap_exchanged_ptes(vstart, order, NULL, in_frame);
+
+ spin_unlock_irqrestore(&xen_reservation_lock, flags);
+}
+EXPORT_SYMBOL_GPL(xen_destroy_contiguous_region);
+
#ifdef CONFIG_XEN_DEBUG_FS

static struct dentry *d_mmu_debug;
diff --git a/include/xen/interface/memory.h b/include/xen/interface/memory.h
index 9df4bd0..63abbb2 100644
--- a/include/xen/interface/memory.h
+++ b/include/xen/interface/memory.h
@@ -55,6 +55,48 @@ struct xen_memory_reservation {
DEFINE_GUEST_HANDLE_STRUCT(xen_memory_reservation);

/*
+ * An atomic exchange of memory pages. If return code is zero then
+ * @out.extent_list provides GMFNs of the newly-allocated memory.
+ * Returns zero on complete success, otherwise a negative error code.
+ * On complete success then always @nr_exchanged == @in.nr_extents.
+ * On partial success @nr_exchanged indicates how much work was done.
+ */
+#define XENMEM_exchange 11
+struct xen_memory_exchange {
+ /*
+ * [IN] Details of memory extents to be exchanged (GMFN bases).
+ * Note that @in.address_bits is ignored and unused.
+ */
+ struct xen_memory_reservation in;
+
+ /*
+ * [IN/OUT] Details of new memory extents.
+ * We require that:
+ * 1. @in.domid == @out.domid
+ * 2. @in.nr_extents << @in.extent_order ==
+ * @out.nr_extents << @out.extent_order
+ * 3. @in.extent_start and @out.extent_start lists must not overlap
+ * 4. @out.extent_start lists GPFN bases to be populated
+ * 5. @out.extent_start is overwritten with allocated GMFN bases
+ */
+ struct xen_memory_reservation out;
+
+ /*
+ * [OUT] Number of input extents that were successfully exchanged:
+ * 1. The first @nr_exchanged input extents were successfully
+ * deallocated.
+ * 2. The corresponding first entries in the output extent list correctly
+ * indicate the GMFNs that were successfully exchanged.
+ * 3. All other input and output extents are untouched.
+ * 4. If not all input extents are exchanged then the return code of this
+ * command will be non-zero.
+ * 5. THIS FIELD MUST BE INITIALISED TO ZERO BY THE CALLER!
+ */
+ unsigned long nr_exchanged;
+};
+
+DEFINE_GUEST_HANDLE_STRUCT(xen_memory_exchange);
+/*
* Returns the maximum machine frame number of mapped RAM in this system.
* This command always succeeds (it never returns an error code).
* arg == NULL.
diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
index 883a21b..d789c93 100644
--- a/include/xen/xen-ops.h
+++ b/include/xen/xen-ops.h
@@ -14,4 +14,10 @@ void xen_mm_unpin_all(void);
void xen_timer_resume(void);
void xen_arch_resume(void);

+extern unsigned long *xen_contiguous_bitmap;
+int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
+ unsigned int address_bits);
+
+void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order);
+
#endif /* INCLUDE_XEN_OPS_H */
--
1.6.0.6

2009-03-13 17:01:54

by Jeremy Fitzhardinge

Subject: [PATCH 08/27] x86/pci: make sure _PAGE_IOMAP is set on pci mappings

From: Jeremy Fitzhardinge <[email protected]>

When mapping pci space via /sys or /proc, make sure we're really
doing a hardware mapping by setting _PAGE_IOMAP.

Signed-off-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/pci/i386.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
index 5ead808..e233e58 100644
--- a/arch/x86/pci/i386.c
+++ b/arch/x86/pci/i386.c
@@ -296,6 +296,9 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
return -EINVAL;

prot = pgprot_val(vma->vm_page_prot);
+
+ prot |= _PAGE_IOMAP; /* creating a mapping for IO */
+
if (pat_enabled && write_combine)
prot |= _PAGE_CACHE_WC;
else if (pat_enabled || boot_cpu_data.x86 > 3)
--
1.6.0.6

2009-03-13 17:02:36

by Jeremy Fitzhardinge

Subject: [PATCH 06/27] x86/PCI: Enable scanning of all pci functions

From: Alex Nixon <[email protected]>

Xen may want to enable scanning of all PCI functions - if, for example,
the device at function 0 is not passed through to the guest but the
device at function 1 is.
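
For context: the generic slot scan gives up on functions 1-7 when
function 0 is absent, and this hook overrides that. Roughly (a
paraphrase of the drivers/pci/probe.c logic, not a verbatim copy):

for (fn = 0; fn < 8; fn++) {
	struct pci_dev *dev = pci_scan_single_device(bus, devfn + fn);

	/* no function 0 and no override: treat the slot as empty */
	if (fn == 0 && !dev && !pcibios_scan_all_fns(bus, devfn))
		break;
}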

Signed-off-by: Alex Nixon <[email protected]>
Signed-off-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/include/asm/pci.h | 8 +++++++-
arch/x86/pci/common.c | 1 +
2 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index a977de2..cd50c02 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -21,6 +21,7 @@ struct pci_sysdata {
extern int pci_routeirq;
extern int noioapicquirk;
extern int noioapicreroute;
+extern int pci_scan_all_fns;

/* scan a bus after allocating a pci_sysdata for it */
extern struct pci_bus *pci_scan_bus_on_node(int busno, struct pci_ops *ops,
@@ -48,7 +49,11 @@ extern unsigned int pcibios_assign_all_busses(void);
#else
#define pcibios_assign_all_busses() 0
#endif
-#define pcibios_scan_all_fns(a, b) 0
+
+static inline int pcibios_scan_all_fns(struct pci_bus *bus, int devfn)
+{
+ return pci_scan_all_fns;
+}

extern unsigned long pci_mem_start;
#define PCIBIOS_MIN_IO 0x1000
@@ -99,6 +104,7 @@ extern void pci_iommu_alloc(void);

/* generic pci stuff */
#include <asm-generic/pci.h>
+#undef pcibios_scan_all_fns

#ifdef CONFIG_NUMA
/* Returns the node based on pci bus */
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 081ebd5..05525bf 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -22,6 +22,7 @@ unsigned int pci_probe = PCI_PROBE_BIOS | PCI_PROBE_CONF1 | PCI_PROBE_CONF2 |
unsigned int pci_early_dump_regs;
static int pci_bf_sort;
int pci_routeirq;
+int pci_scan_all_fns = 0;
int noioapicquirk;
#ifdef CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS
int noioapicreroute = 0;
--
1.6.0.6

2009-03-13 17:02:20

by Jeremy Fitzhardinge

Subject: [PATCH 03/27] Xen: Rename the balloon lock

From: Alex Nixon <[email protected]>

* xen_create_contiguous_region needs access to the balloon lock to
ensure memory doesn't change under its feet, so expose the balloon
lock
* Change the name of the lock to xen_reservation_lock, to reflect its
now less-specific usage.

Signed-off-by: Alex Nixon <[email protected]>
---
arch/x86/xen/mmu.c | 7 +++++++
drivers/xen/balloon.c | 15 ++++-----------
include/xen/interface/memory.h | 8 ++++++++
3 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 58af0b0..c2aa44b 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -67,6 +67,13 @@

#define MMU_UPDATE_HISTO 30

+/*
+ * Protects atomic reservation decrease/increase against concurrent increases.
+ * Also protects non-atomic updates of current_pages and driver_pages, and
+ * balloon lists.
+ */
+DEFINE_SPINLOCK(xen_reservation_lock);
+
#ifdef CONFIG_XEN_DEBUG_FS

static struct {
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index efa4b36..1e7984d 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -84,13 +84,6 @@ static struct sys_device balloon_sysdev;

static int register_balloon(struct sys_device *sysdev);

-/*
- * Protects atomic reservation decrease/increase against concurrent increases.
- * Also protects non-atomic updates of current_pages and driver_pages, and
- * balloon lists.
- */
-static DEFINE_SPINLOCK(balloon_lock);
-
static struct balloon_stats balloon_stats;

/* We increase/decrease in batches which fit in a page */
@@ -209,7 +202,7 @@ static int increase_reservation(unsigned long nr_pages)
if (nr_pages > ARRAY_SIZE(frame_list))
nr_pages = ARRAY_SIZE(frame_list);

- spin_lock_irqsave(&balloon_lock, flags);
+ spin_lock_irqsave(&xen_reservation_lock, flags);

page = balloon_first_page();
for (i = 0; i < nr_pages; i++) {
@@ -267,7 +260,7 @@ static int increase_reservation(unsigned long nr_pages)
totalram_pages = balloon_stats.current_pages;

out:
- spin_unlock_irqrestore(&balloon_lock, flags);
+ spin_unlock_irqrestore(&xen_reservation_lock, flags);

return 0;
}
@@ -312,7 +305,7 @@ static int decrease_reservation(unsigned long nr_pages)
kmap_flush_unused();
flush_tlb_all();

- spin_lock_irqsave(&balloon_lock, flags);
+ spin_lock_irqsave(&xen_reservation_lock, flags);

/* No more mappings: invalidate P2M and add to balloon. */
for (i = 0; i < nr_pages; i++) {
@@ -329,7 +322,7 @@ static int decrease_reservation(unsigned long nr_pages)
balloon_stats.current_pages -= nr_pages;
totalram_pages = balloon_stats.current_pages;

- spin_unlock_irqrestore(&balloon_lock, flags);
+ spin_unlock_irqrestore(&xen_reservation_lock, flags);

return need_sleep;
}
diff --git a/include/xen/interface/memory.h b/include/xen/interface/memory.h
index f548f7c..9df4bd0 100644
--- a/include/xen/interface/memory.h
+++ b/include/xen/interface/memory.h
@@ -9,6 +9,8 @@
#ifndef __XEN_PUBLIC_MEMORY_H__
#define __XEN_PUBLIC_MEMORY_H__

+#include <linux/spinlock.h>
+
/*
* Increase or decrease the specified domain's memory reservation. Returns a
* -ve errcode on failure, or the # extents successfully allocated or freed.
@@ -184,4 +186,10 @@ DEFINE_GUEST_HANDLE_STRUCT(xen_memory_map);
*/
#define XENMEM_machine_memory_map 10

+/*
+ * Prevent the balloon driver from changing the memory reservation during a driver
+ * critical region.
+ */
+extern spinlock_t xen_reservation_lock;
+
#endif /* __XEN_PUBLIC_MEMORY_H__ */
--
1.6.0.6

2009-03-13 17:03:19

by Jeremy Fitzhardinge

Subject: [PATCH 10/27] xen: make sure swiotlb allocation is physically contiguous

When allocating the swiotlb buffer under Xen, make sure the memory is
physically contiguous so that it's really suitable for DMA.

Signed-off-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/kernel/pci-swiotlb.c | 17 +++++++++++++++--
drivers/pci/xen-iommu.c | 16 ++++++++++++++++
include/xen/swiotlb.h | 12 ++++++++++++
3 files changed, 43 insertions(+), 2 deletions(-)
create mode 100644 include/xen/swiotlb.h

diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index 34f12e9..a7b9410 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -11,16 +11,29 @@
#include <asm/swiotlb.h>
#include <asm/dma.h>

+#include <xen/swiotlb.h>
+#include <asm/xen/hypervisor.h>
+
int swiotlb __read_mostly;

void * __init swiotlb_alloc_boot(size_t size, unsigned long nslabs)
{
- return alloc_bootmem_low_pages(size);
+ void *ret = alloc_bootmem_low_pages(size);
+
+ if (ret && xen_pv_domain())
+ xen_swiotlb_fixup(ret, size, nslabs);
+
+ return ret;
}

void *swiotlb_alloc(unsigned order, unsigned long nslabs)
{
- return (void *)__get_free_pages(GFP_DMA | __GFP_NOWARN, order);
+ void *ret = (void *)__get_free_pages(GFP_DMA | __GFP_NOWARN, order);
+
+ if (ret && xen_pv_domain())
+ xen_swiotlb_fixup(ret, 1u << order, nslabs);
+
+ return ret;
}

dma_addr_t swiotlb_phys_to_bus(struct device *hwdev, phys_addr_t paddr)
diff --git a/drivers/pci/xen-iommu.c b/drivers/pci/xen-iommu.c
index 5b701e8..dc0da76 100644
--- a/drivers/pci/xen-iommu.c
+++ b/drivers/pci/xen-iommu.c
@@ -12,6 +12,7 @@
#include <xen/grant_table.h>
#include <xen/page.h>
#include <xen/xen-ops.h>
+#include <xen/swiotlb.h>

#include <asm/iommu.h>
#include <asm/swiotlb.h>
@@ -42,6 +43,21 @@ struct dma_coherent_mem {
unsigned long *bitmap;
};

+void xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
+{
+ unsigned order = get_order(size);
+
+ printk(KERN_DEBUG "xen_swiotlb_fixup: buf=%p size=%zu order=%u\n",
+ buf, size, order);
+
+ if (WARN_ON(size != (PAGE_SIZE << order)))
+ return;
+
+ if (xen_create_contiguous_region((unsigned long)buf,
+ order, 0xffffffff))
+ printk(KERN_ERR "xen_create_contiguous_region failed\n");
+}
+
static inline int address_needs_mapping(struct device *hwdev,
dma_addr_t addr)
{
diff --git a/include/xen/swiotlb.h b/include/xen/swiotlb.h
new file mode 100644
index 0000000..8d59439
--- /dev/null
+++ b/include/xen/swiotlb.h
@@ -0,0 +1,12 @@
+#ifndef _XEN_SWIOTLB_H
+#define _XEN_SWIOTLB_H
+
+#ifdef CONFIG_PCI_XEN
+extern void xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs);
+#else
+static inline void xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
+{
+}
+#endif
+
+#endif /* _XEN_SWIOTLB_H */
--
1.6.0.6

2009-03-13 17:03:35

by Jeremy Fitzhardinge

Subject: [PATCH 12/27] xen: add hooks for mapping phys<->bus addresses in swiotlb

From: Ian Campbell <[email protected]>

Add hooks to allow Xen to translate between pfns and mfns for the swiotlb
layer, so that DMA actually ends up going to the proper machine pages.

Signed-off-by: Ian Campbell <[email protected]>
Signed-off-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/kernel/pci-swiotlb.c | 8 ++++++++
drivers/pci/xen-iommu.c | 10 ++++++++++
include/xen/swiotlb.h | 3 +++
3 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index 298f01d..0943813 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -33,11 +33,19 @@ void *swiotlb_alloc(unsigned order, unsigned long nslabs)

dma_addr_t swiotlb_phys_to_bus(struct device *hwdev, phys_addr_t paddr)
{
+#ifdef CONFIG_PCI_XEN
+ if (xen_pv_domain())
+ return xen_phys_to_bus(paddr);
+#endif
return paddr;
}

phys_addr_t swiotlb_bus_to_phys(dma_addr_t baddr)
{
+#ifdef CONFIG_PCI_XEN
+ if (xen_pv_domain())
+ return xen_bus_to_phys(baddr);
+#endif
return baddr;
}

diff --git a/drivers/pci/xen-iommu.c b/drivers/pci/xen-iommu.c
index d546698..e7ba06b 100644
--- a/drivers/pci/xen-iommu.c
+++ b/drivers/pci/xen-iommu.c
@@ -67,6 +67,16 @@ void xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
}
}

+dma_addr_t xen_phys_to_bus(phys_addr_t paddr)
+{
+ return phys_to_machine(XPADDR(paddr)).maddr;
+}
+
+phys_addr_t xen_bus_to_phys(dma_addr_t daddr)
+{
+ return machine_to_phys(XMADDR(daddr)).paddr;
+}
+
static inline int address_needs_mapping(struct device *hwdev,
dma_addr_t addr)
{
diff --git a/include/xen/swiotlb.h b/include/xen/swiotlb.h
index 8d59439..9ecaff1 100644
--- a/include/xen/swiotlb.h
+++ b/include/xen/swiotlb.h
@@ -9,4 +9,7 @@ static inline void xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslab
}
#endif

+extern phys_addr_t xen_bus_to_phys(dma_addr_t daddr);
+extern dma_addr_t xen_phys_to_bus(phys_addr_t paddr);
+
#endif /* _XEN_SWIOTLB_H */
--
1.6.0.6

2009-03-13 17:04:36

by Jeremy Fitzhardinge

Subject: [PATCH 09/27] xen/pci: clean up Kconfig a bit

Cut down on the maze of PCI-related config options.

Signed-off-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/Kconfig | 4 ++--
arch/x86/xen/Kconfig | 2 ++
2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 9092750..653982c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1829,8 +1829,8 @@ config PCI_OLPC
depends on PCI && OLPC && (PCI_GOOLPC || PCI_GOANY)

config PCI_XEN
- def_bool y
- depends on XEN_PCI_PASSTHROUGH || XEN_DOM0_PCI
+ bool
+ select SWIOTLB

config PCI_DOMAINS
def_bool y
diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
index fe69286..87c13db 100644
--- a/arch/x86/xen/Kconfig
+++ b/arch/x86/xen/Kconfig
@@ -55,6 +55,7 @@ config XEN_PRIVILEGED_GUEST
config XEN_PCI_PASSTHROUGH
bool #"Enable support for Xen PCI passthrough devices"
depends on XEN && PCI
+ select PCI_XEN
help
Enable support for passing PCI devices through to
unprivileged domains. (COMPLETELY UNTESTED)
@@ -62,3 +63,4 @@ config XEN_PCI_PASSTHROUGH
config XEN_DOM0_PCI
def_bool y
depends on XEN_DOM0 && PCI
+ select PCI_XEN
--
1.6.0.6

2009-03-13 17:04:16

by Jeremy Fitzhardinge

Subject: [PATCH 05/27] x86/PCI: Clean up pci_cache_line_size

From: Alex Nixon <[email protected]>

Separate out the x86 cache_line_size initialisation code into its own
function (so it can be shared by Xen later in this patch series).
Signed-off-by: Alex Nixon <[email protected]>
Signed-off-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/include/asm/pci_x86.h | 1 +
arch/x86/pci/common.c | 17 +++++++++++------
2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/pci_x86.h b/arch/x86/include/asm/pci_x86.h
index e60fd3e..5401ca2 100644
--- a/arch/x86/include/asm/pci_x86.h
+++ b/arch/x86/include/asm/pci_x86.h
@@ -45,6 +45,7 @@ enum pci_bf_sort_state {
extern unsigned int pcibios_max_latency;

void pcibios_resource_survey(void);
+void pcibios_set_cache_line_size(void);

/* pci-pc.c */

diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 82d22fc..081ebd5 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -409,26 +409,31 @@ struct pci_bus * __devinit pcibios_scan_root(int busnum)

extern u8 pci_cache_line_size;

-int __init pcibios_init(void)
+void pcibios_set_cache_line_size(void)
{
struct cpuinfo_x86 *c = &boot_cpu_data;

- if (!raw_pci_ops) {
- printk(KERN_WARNING "PCI: System does not support PCI\n");
- return 0;
- }
-
/*
* Assume PCI cacheline size of 32 bytes for all x86s except K7/K8
* and P4. It's also good for 386/486s (which actually have 16)
* as quite a few PCI devices do not support smaller values.
*/
+
pci_cache_line_size = 32 >> 2;
if (c->x86 >= 6 && c->x86_vendor == X86_VENDOR_AMD)
pci_cache_line_size = 64 >> 2; /* K7 & K8 */
else if (c->x86 > 6 && c->x86_vendor == X86_VENDOR_INTEL)
pci_cache_line_size = 128 >> 2; /* P4 */
+}
+
+int __init pcibios_init(void)
+{
+ if (!raw_pci_ops) {
+ printk(KERN_WARNING "PCI: System does not support PCI\n");
+ return 0;
+ }

+ pcibios_set_cache_line_size();
pcibios_resource_survey();

if (pci_bf_sort >= pci_force_bf)
--
1.6.0.6

2009-03-13 17:03:51

by Jeremy Fitzhardinge

Subject: [PATCH 11/27] xen swiotlb: fixup swiotlb in chunks smaller than MAX_CONTIG_ORDER

From: Ian Campbell <[email protected]>

Don't ask Xen to make contiguous memory ranges larger than it can cope
with; fix up the swiotlb buffer in IO_TLB_SEGSIZE-sized chunks instead.

Signed-off-by: Ian Campbell <[email protected]>
Signed-off-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/kernel/pci-swiotlb.c | 7 +------
drivers/pci/xen-iommu.c | 27 ++++++++++++++++++---------
2 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index a7b9410..298f01d 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -28,12 +28,7 @@ void * __init swiotlb_alloc_boot(size_t size, unsigned long nslabs)

void *swiotlb_alloc(unsigned order, unsigned long nslabs)
{
- void *ret = (void *)__get_free_pages(GFP_DMA | __GFP_NOWARN, order);
-
- if (ret && xen_pv_domain())
- xen_swiotlb_fixup(ret, 1u << order, nslabs);
-
- return ret;
+ BUG();
}

dma_addr_t swiotlb_phys_to_bus(struct device *hwdev, phys_addr_t paddr)
diff --git a/drivers/pci/xen-iommu.c b/drivers/pci/xen-iommu.c
index dc0da76..d546698 100644
--- a/drivers/pci/xen-iommu.c
+++ b/drivers/pci/xen-iommu.c
@@ -5,6 +5,7 @@
#include <linux/module.h>
#include <linux/version.h>
#include <linux/scatterlist.h>
+#include <linux/swiotlb.h>
#include <linux/io.h>
#include <linux/bug.h>

@@ -43,19 +44,27 @@ struct dma_coherent_mem {
unsigned long *bitmap;
};

+static int max_dma_bits = 32;
+
void xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
{
- unsigned order = get_order(size);
-
- printk(KERN_DEBUG "xen_swiotlb_fixup: buf=%p size=%zu order=%u\n",
- buf, size, order);
-
- if (WARN_ON(size != (PAGE_SIZE << order)))
- return;
-
- if (xen_create_contiguous_region((unsigned long)buf,
- order, 0xffffffff))
- printk(KERN_ERR "xen_create_contiguous_region failed\n");
+ int i, rc;
+ int dma_bits;
+
+ printk(KERN_DEBUG "xen_swiotlb_fixup: buf=%p size=%zu\n",
+ buf, size);
+
+ dma_bits = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT) + PAGE_SHIFT;
+ for (i = 0; i < nslabs; i += IO_TLB_SEGSIZE) {
+ do {
+ rc = xen_create_contiguous_region(
+ (unsigned long)buf + (i << IO_TLB_SHIFT),
+ get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT),
+ dma_bits);
+ } while (rc && dma_bits++ < max_dma_bits);
+ if (rc)
+ panic("xen_create_contiguous_region failed\n");
+ }
}

static inline int address_needs_mapping(struct device *hwdev,
--
1.6.0.6

2009-03-13 17:05:19

by Jeremy Fitzhardinge

Subject: [PATCH 14/27] xen: enable swiotlb for xen domain 0.

From: Ian Campbell <[email protected]>

Signed-off-by: Ian Campbell <[email protected]>
---
arch/x86/kernel/pci-swiotlb.c | 4 ++++
arch/x86/xen/Kconfig | 1 +
drivers/pci/xen-iommu.c | 5 +++++
include/xen/swiotlb.h | 2 ++
4 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index 5826be8..6d3ba02 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -94,6 +94,10 @@ void __init pci_swiotlb_init(void)
if (!iommu_detected && !no_iommu && max_pfn > MAX_DMA32_PFN)
swiotlb = 1;
#endif
+#ifdef CONFIG_PCI_XEN
+ if (xen_wants_swiotlb())
+ swiotlb = 1;
+#endif
if (swiotlb_force)
swiotlb = 1;
if (swiotlb) {
diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
index 87c13db..2c85967 100644
--- a/arch/x86/xen/Kconfig
+++ b/arch/x86/xen/Kconfig
@@ -6,6 +6,7 @@ config XEN
bool "Xen guest support"
select PARAVIRT
select PARAVIRT_CLOCK
+ select SWIOTLB
depends on X86_64 || (X86_32 && X86_PAE && !X86_VISWS)
depends on X86_CMPXCHG && X86_TSC
help
diff --git a/drivers/pci/xen-iommu.c b/drivers/pci/xen-iommu.c
index 0995ddf..80bb7ed 100644
--- a/drivers/pci/xen-iommu.c
+++ b/drivers/pci/xen-iommu.c
@@ -67,6 +67,11 @@ void xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
}
}

+int xen_wants_swiotlb(void)
+{
+ return xen_initial_domain();
+}
+
dma_addr_t xen_phys_to_bus(phys_addr_t paddr)
{
return phys_to_machine(XPADDR(paddr)).maddr;
diff --git a/include/xen/swiotlb.h b/include/xen/swiotlb.h
index e15caa8..defac5c 100644
--- a/include/xen/swiotlb.h
+++ b/include/xen/swiotlb.h
@@ -9,6 +9,8 @@ static inline void xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslab
}
#endif

+extern int xen_wants_swiotlb(void);
+
extern phys_addr_t xen_bus_to_phys(dma_addr_t daddr);
extern dma_addr_t xen_phys_to_bus(phys_addr_t paddr);

--
1.6.0.6

2009-03-13 17:05:54

by Jeremy Fitzhardinge

Subject: [PATCH 17/27] xen/swiotlb: improve comment on gfp flags in xen_alloc_coherent()

From: Jeremy Fitzhardinge <[email protected]>

Clarify why we don't care about the kernel's pseudo-phys restrictions,
so long as the underlying pages are in the right place.

Signed-off-by: Jeremy Fitzhardinge <[email protected]>
---
drivers/pci/xen-iommu.c | 10 ++++++----
1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/xen-iommu.c b/drivers/pci/xen-iommu.c
index 7b6fd3e..13cfe0c 100644
--- a/drivers/pci/xen-iommu.c
+++ b/drivers/pci/xen-iommu.c
@@ -202,15 +202,17 @@ static void *xen_alloc_coherent(struct device *dev, size_t size,
unsigned long vstart;
u64 mask;

- /* ignore region specifiers */
+ /*
+ * Ignore region specifiers - the kernel's ideas of
+ * pseudo-phys memory layout has nothing to do with the
+ * machine physical layout. We can't allocate highmem
+ * because we can't return a pointer to it.
+ */
gfp &= ~(__GFP_DMA | __GFP_HIGHMEM);

if (dma_alloc_from_coherent(dev, size, dma_handle, &ret))
return ret;

- if (dev == NULL || (dev->coherent_dma_mask < 0xffffffff))
- gfp |= GFP_DMA;
-
vstart = __get_free_pages(gfp, order);
ret = (void *)vstart;

--
1.6.0.6

2009-03-13 17:05:39

by Jeremy Fitzhardinge

Subject: [PATCH 15/27] xen/swiotlb: use dma_alloc_from_coherent to get device coherent memory

From: Jeremy Fitzhardinge <[email protected]>

This avoids the previous hack of replicating struct dma_coherent_mem.

Signed-off-by: Jeremy Fitzhardinge <[email protected]>
---
drivers/pci/xen-iommu.c | 39 ++++++++-------------------------------
1 files changed, 8 insertions(+), 31 deletions(-)

diff --git a/drivers/pci/xen-iommu.c b/drivers/pci/xen-iommu.c
index 80bb7ed..608b8e2 100644
--- a/drivers/pci/xen-iommu.c
+++ b/drivers/pci/xen-iommu.c
@@ -36,14 +36,6 @@ do { \
(unsigned long long)addr + size); \
} while (0)

-struct dma_coherent_mem {
- void *virt_base;
- u32 device_base;
- int size;
- int flags;
- unsigned long *bitmap;
-};
-
static int max_dma_bits = 32;

void xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
@@ -200,7 +192,6 @@ static void *xen_alloc_coherent(struct device *dev, size_t size,
dma_addr_t *dma_handle, gfp_t gfp)
{
void *ret;
- struct dma_coherent_mem *mem = dev ? dev->dma_mem : NULL;
unsigned int order = get_order(size);
unsigned long vstart;
u64 mask;
@@ -208,18 +199,8 @@ static void *xen_alloc_coherent(struct device *dev, size_t size,
/* ignore region specifiers */
gfp &= ~(__GFP_DMA | __GFP_HIGHMEM);

- if (mem) {
- int page = bitmap_find_free_region(mem->bitmap, mem->size,
- order);
- if (page >= 0) {
- *dma_handle = mem->device_base + (page << PAGE_SHIFT);
- ret = mem->virt_base + (page << PAGE_SHIFT);
- memset(ret, 0, size);
- return ret;
- }
- if (mem->flags & DMA_MEMORY_EXCLUSIVE)
- return NULL;
- }
+ if (dma_alloc_from_coherent(dev, size, dma_handle, &ret))
+ return ret;

if (dev == NULL || (dev->coherent_dma_mask < 0xffffffff))
gfp |= GFP_DMA;
@@ -245,19 +226,15 @@ static void *xen_alloc_coherent(struct device *dev, size_t size,
}

static void xen_free_coherent(struct device *dev, size_t size,
- void *vaddr, dma_addr_t dma_addr)
+ void *vaddr, dma_addr_t dma_addr)
{
- struct dma_coherent_mem *mem = dev ? dev->dma_mem : NULL;
int order = get_order(size);

- if (mem && vaddr >= mem->virt_base &&
- vaddr < (mem->virt_base + (mem->size << PAGE_SHIFT))) {
- int page = (vaddr - mem->virt_base) >> PAGE_SHIFT;
- bitmap_release_region(mem->bitmap, page, order);
- } else {
- xen_destroy_contiguous_region((unsigned long)vaddr, order);
- free_pages((unsigned long)vaddr, order);
- }
+ if (dma_release_from_coherent(dev, order, vaddr))
+ return;
+
+ xen_destroy_contiguous_region((unsigned long)vaddr, order);
+ free_pages((unsigned long)vaddr, order);
}

static dma_addr_t xen_map_page(struct device *dev, struct page *page,
--
1.6.0.6

2009-03-13 17:06:37

by Jeremy Fitzhardinge

Subject: [PATCH 20/27] xen mtrr: Use specific cpu_has_foo macros instead of generic cpu_has()

From: Mark McLoughlin <[email protected]>

Signed-off-by: Mark McLoughlin <[email protected]>
---
arch/x86/kernel/cpu/mtrr/xen.c | 10 ++++------
1 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/mtrr/xen.c b/arch/x86/kernel/cpu/mtrr/xen.c
index db3ef39..e03532c 100644
--- a/arch/x86/kernel/cpu/mtrr/xen.c
+++ b/arch/x86/kernel/cpu/mtrr/xen.c
@@ -44,15 +44,13 @@ static int __init xen_num_var_ranges(void)

void __init xen_init_mtrr(void)
{
- struct cpuinfo_x86 *c = &boot_cpu_data;
-
if (!xen_initial_domain())
return;

- if ((!cpu_has(c, X86_FEATURE_MTRR)) &&
- (!cpu_has(c, X86_FEATURE_K6_MTRR)) &&
- (!cpu_has(c, X86_FEATURE_CYRIX_ARR)) &&
- (!cpu_has(c, X86_FEATURE_CENTAUR_MCR)))
+ if (!cpu_has_mtrr &&
+ !cpu_has_k6_mtrr &&
+ !cpu_has_cyrix_arr &&
+ !cpu_has_centaur_mcr)
return;

mtrr_if = &xen_mtrr_ops;
--
1.6.0.6

2009-03-13 17:06:20

by Jeremy Fitzhardinge

Subject: [PATCH 13/27] xen/swiotlb: add hook for swiotlb_arch_range_needs_mapping

From: Ian Campbell <[email protected]>

Add a hook so that Xen can determine whether a particular address range
needs pfn<->mfn mapping.

Signed-off-by: Ian Campbell <[email protected]>
Signed-off-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/kernel/pci-swiotlb.c | 4 ++++
drivers/pci/xen-iommu.c | 5 +++++
include/xen/swiotlb.h | 2 ++
3 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index 0943813..5826be8 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -51,6 +51,10 @@ phys_addr_t swiotlb_bus_to_phys(dma_addr_t baddr)

int __weak swiotlb_arch_range_needs_mapping(phys_addr_t paddr, size_t size)
{
+#ifdef CONFIG_PCI_XEN
+ if (xen_pv_domain())
+ return xen_range_needs_mapping(paddr, size);
+#endif
return 0;
}

diff --git a/drivers/pci/xen-iommu.c b/drivers/pci/xen-iommu.c
index e7ba06b..0995ddf 100644
--- a/drivers/pci/xen-iommu.c
+++ b/drivers/pci/xen-iommu.c
@@ -126,6 +126,11 @@ static int range_straddles_page_boundary(phys_addr_t p, size_t size)
return 1;
}

+int xen_range_needs_mapping(phys_addr_t paddr, size_t size)
+{
+ return range_straddles_page_boundary(paddr, size);
+}
+
static inline void xen_dma_unmap_page(struct page *page)
{
/* Xen TODO: 2.6.18 xen calls __gnttab_dma_unmap_page here
diff --git a/include/xen/swiotlb.h b/include/xen/swiotlb.h
index 9ecaff1..e15caa8 100644
--- a/include/xen/swiotlb.h
+++ b/include/xen/swiotlb.h
@@ -12,4 +12,6 @@ static inline void xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslab
extern phys_addr_t xen_bus_to_phys(dma_addr_t daddr);
extern dma_addr_t xen_phys_to_bus(phys_addr_t paddr);

+extern int xen_range_needs_mapping(phys_addr_t phys, size_t size);
+
#endif /* _XEN_SWIOTLB_H */
--
1.6.0.6

2009-03-13 17:06:53

by Jeremy Fitzhardinge

Subject: [PATCH 25/27] x86: define arch_vm_get_page_prot to set _PAGE_IOMAP on VM_IO vmas

From: Jeremy Fitzhardinge <[email protected]>

Set _PAGE_IOMAP in ptes mapping a VM_IO vma. This says that the mapping
is of a real piece of physical hardware, and not just system memory.

Xen, in particular, uses this to inhibit the pfn->mfn conversion that
would normally happen - in other words, to treat the address directly
as a machine physical address without converting it from pseudo-physical.
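
For reference, the generic helper that consumes this hook just ORs the
arch-specific bits into the protection derived from the vma flags; in
this era's mm/mmap.c it reads approximately:

pgprot_t vm_get_page_prot(unsigned long vm_flags)
{
	return __pgprot(pgprot_val(protection_map[vm_flags &
				(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]) |
			pgprot_val(arch_vm_get_page_prot(vm_flags)));
}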

Signed-off-by: Jeremy Fitzhardinge <[email protected]>
Cc: David Airlie <[email protected]>
---
arch/x86/include/asm/pgtable.h | 3 +++
arch/x86/mm/pgtable.c | 10 ++++++++++
2 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index d37e55e..d7cbfaa 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -391,6 +391,9 @@ static inline unsigned long pages_to_mb(unsigned long npg)
#define io_remap_pfn_range(vma, vaddr, pfn, size, prot) \
remap_pfn_range(vma, vaddr, pfn, size, prot)

+#define arch_vm_get_page_prot arch_vm_get_page_prot
+extern pgprot_t arch_vm_get_page_prot(unsigned vm_flags);
+
#if PAGETABLE_LEVELS > 2
static inline int pud_none(pud_t pud)
{
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 7a4d6ee..d9da313 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -6,6 +6,16 @@

#define PGALLOC_GFP GFP_KERNEL | __GFP_NOTRACK | __GFP_REPEAT | __GFP_ZERO

+pgprot_t arch_vm_get_page_prot(unsigned vm_flags)
+{
+ pgprot_t ret = __pgprot(0);
+
+ if (vm_flags & VM_IO)
+ ret = __pgprot(_PAGE_IOMAP);
+
+ return ret;
+}
+
pte_t *pte_alloc_one_kernel(struct mm_struct *mm, unsigned long address)
{
return (pte_t *)__get_free_page(PGALLOC_GFP);
--
1.6.0.6

2009-03-13 17:08:01

by Jeremy Fitzhardinge

Subject: [PATCH 21/27] xen mtrr: Use generic_validate_add_page()

From: Mark McLoughlin <[email protected]>

The hypervisor already performs the same validation, but it's better to
do it early, before getting to the range-combining code.

Signed-off-by: Mark McLoughlin <[email protected]>
---
arch/x86/kernel/cpu/mtrr/xen.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/cpu/mtrr/xen.c b/arch/x86/kernel/cpu/mtrr/xen.c
index e03532c..622d075 100644
--- a/arch/x86/kernel/cpu/mtrr/xen.c
+++ b/arch/x86/kernel/cpu/mtrr/xen.c
@@ -22,7 +22,7 @@ static struct mtrr_ops xen_mtrr_ops = {
// .set = xen_set_mtrr,
// .get = xen_get_mtrr,
.get_free_region = generic_get_free_region,
-// .validate_add_page = xen_validate_add_page,
+ .validate_add_page = generic_validate_add_page,
.have_wrcomb = positive_have_wrcomb,
.use_intel_if = 0,
.num_var_ranges = xen_num_var_ranges,
--
1.6.0.6

2009-03-13 17:08:45

by Jeremy Fitzhardinge

Subject: [PATCH 26/27] agp: use more dma-ops-like operations for agp memory

From: Jeremy Fitzhardinge <[email protected]>

When using AGP under Xen, we need to be careful to
1) properly translate between physical and machine addresses, and
2) make sure memory is physically contiguous when the hardware expects it

This change uses swiotlb_phys_to_bus/bus_to_phys to do the phys<->gart
conversion, since they already do the right thing, and dma_alloc_coherent
for gatt allocations. This should work equally well running native.
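
As a hedged illustration (names and signatures approximate for this
era's agpgart code): every page the GART is pointed at passes through
phys_to_gart(), so under Xen the GATT entry is built from a machine
address rather than a pseudo-physical one:

/* sketch: building one GATT entry from a struct page */
unsigned long bus = phys_to_gart(page_to_phys(page));
gatt[i] = agp_bridge->driver->mask_memory(agp_bridge, bus, type);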

Signed-off-by: Jeremy Fitzhardinge <[email protected]>
Cc: David Airlie <[email protected]>
---
arch/x86/include/asm/agp.h | 15 ++++++++++-----
lib/swiotlb.c | 2 ++
2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/agp.h b/arch/x86/include/asm/agp.h
index 9825cd6..7ba2639 100644
--- a/arch/x86/include/asm/agp.h
+++ b/arch/x86/include/asm/agp.h
@@ -1,8 +1,11 @@
#ifndef _ASM_X86_AGP_H
#define _ASM_X86_AGP_H

+#include <linux/swiotlb.h>
+
#include <asm/pgtable.h>
#include <asm/cacheflush.h>
+#include <asm/dma-mapping.h>

/*
* Functions to keep the agpgart mappings coherent with the MMU. The
@@ -23,13 +26,15 @@
#define flush_agp_cache() wbinvd()

/* Convert a physical address to an address suitable for the GART. */
-#define phys_to_gart(x) (x)
-#define gart_to_phys(x) (x)
+#define phys_to_gart(x) swiotlb_phys_to_bus(NULL, (x))
+#define gart_to_phys(x) swiotlb_bus_to_phys(x)

/* GATT allocation. Returns/accepts GATT kernel virtual address. */
-#define alloc_gatt_pages(order) \
- ((char *)__get_free_pages(GFP_KERNEL, (order)))
+#define alloc_gatt_pages(order) ({ \
+ char *_t; dma_addr_t _d; \
+ _t = dma_alloc_coherent(NULL,PAGE_SIZE<<(order),&_d,GFP_KERNEL); \
+ _t; })
#define free_gatt_pages(table, order) \
- free_pages((unsigned long)(table), (order))
+ dma_free_coherent(NULL,PAGE_SIZE<<(order),(table),virt_to_bus(table))

#endif /* _ASM_X86_AGP_H */
diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 8e6f6c8..98fb7d3 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -128,11 +128,13 @@ dma_addr_t __weak swiotlb_phys_to_bus(struct device *hwdev, phys_addr_t paddr)
{
return paddr;
}
+EXPORT_SYMBOL_GPL(swiotlb_phys_to_bus);

phys_addr_t __weak swiotlb_bus_to_phys(dma_addr_t baddr)
{
return baddr;
}
+EXPORT_SYMBOL_GPL(swiotlb_bus_to_phys);

static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev,
volatile void *address)
--
1.6.0.6

2009-03-13 17:08:28

by Jeremy Fitzhardinge

Subject: [PATCH 22/27] xen mtrr: Implement xen_get_free_region()

From: Mark McLoughlin <[email protected]>

When an already set MTRR is being changed, we need to
first unset, since Xen also maintains a usage count.

Signed-off-by: Mark McLoughlin <[email protected]>
---
arch/x86/kernel/cpu/mtrr/mtrr.h | 2 ++
arch/x86/kernel/cpu/mtrr/xen.c | 27 ++++++++++++++++++++++++++-
2 files changed, 28 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/cpu/mtrr/mtrr.h b/arch/x86/kernel/cpu/mtrr/mtrr.h
index eb23ca2..6142d6e 100644
--- a/arch/x86/kernel/cpu/mtrr/mtrr.h
+++ b/arch/x86/kernel/cpu/mtrr/mtrr.h
@@ -5,6 +5,8 @@
#include <linux/types.h>
#include <linux/stddef.h>

+#include <asm/mtrr.h>
+
#define MTRRcap_MSR 0x0fe
#define MTRRdefType_MSR 0x2ff

diff --git a/arch/x86/kernel/cpu/mtrr/xen.c b/arch/x86/kernel/cpu/mtrr/xen.c
index 622d075..1ffb93c 100644
--- a/arch/x86/kernel/cpu/mtrr/xen.c
+++ b/arch/x86/kernel/cpu/mtrr/xen.c
@@ -15,13 +15,38 @@

static int __init xen_num_var_ranges(void);

+static int xen_get_free_region(unsigned long base, unsigned long size, int replace_reg)
+{
+ struct xen_platform_op op;
+ int error;
+
+ if (replace_reg < 0)
+ return generic_get_free_region(base, size, -1);
+
+ /* If we're replacing the contents of a register,
+ * we need to first unset it since Xen also keeps
+ * a usage count.
+ */
+ op.cmd = XENPF_del_memtype;
+ op.u.del_memtype.handle = 0;
+ op.u.del_memtype.reg = replace_reg;
+
+ error = HYPERVISOR_dom0_op(&op);
+ if (error) {
+ BUG_ON(error > 0);
+ return error;
+ }
+
+ return replace_reg;
+}
+
/* DOM0 TODO: Need to fill in the remaining mtrr methods to have full
* working userland mtrr support. */
static struct mtrr_ops xen_mtrr_ops = {
.vendor = X86_VENDOR_UNKNOWN,
// .set = xen_set_mtrr,
// .get = xen_get_mtrr,
- .get_free_region = generic_get_free_region,
+ .get_free_region = xen_get_free_region,
.validate_add_page = generic_validate_add_page,
.have_wrcomb = positive_have_wrcomb,
.use_intel_if = 0,
--
1.6.0.6

2009-03-13 17:07:17

by Jeremy Fitzhardinge

Subject: [PATCH 23/27] xen mtrr: Add xen_{get,set}_mtrr() implementations

From: Mark McLoughlin <[email protected]>

Straightforward apart from the hack to turn mtrr_ops->set()
into a no-op on all but one CPU.

Signed-off-by: Mark McLoughlin <[email protected]>
---
arch/x86/kernel/cpu/mtrr/xen.c | 52 ++++++++++++++++++++++++++++++++++++---
1 files changed, 48 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/mtrr/xen.c b/arch/x86/kernel/cpu/mtrr/xen.c
index 1ffb93c..184066d 100644
--- a/arch/x86/kernel/cpu/mtrr/xen.c
+++ b/arch/x86/kernel/cpu/mtrr/xen.c
@@ -15,6 +15,52 @@

static int __init xen_num_var_ranges(void);

+static void xen_set_mtrr(unsigned int reg, unsigned long base,
+ unsigned long size, mtrr_type type)
+{
+ struct xen_platform_op op;
+ int error;
+
+ /* mtrr_ops->set() is called once per CPU,
+ * but Xen's ops apply to all CPUs.
+ */
+ if (smp_processor_id())
+ return;
+
+ if (size == 0) {
+ op.cmd = XENPF_del_memtype;
+ op.u.del_memtype.handle = 0;
+ op.u.del_memtype.reg = reg;
+ } else {
+ op.cmd = XENPF_add_memtype;
+ op.u.add_memtype.mfn = base;
+ op.u.add_memtype.nr_mfns = size;
+ op.u.add_memtype.type = type;
+ }
+
+ error = HYPERVISOR_dom0_op(&op);
+ BUG_ON(error != 0);
+}
+
+static void xen_get_mtrr(unsigned int reg, unsigned long *base,
+ unsigned long *size, mtrr_type *type)
+{
+ struct xen_platform_op op;
+
+ op.cmd = XENPF_read_memtype;
+ op.u.read_memtype.reg = reg;
+ if (HYPERVISOR_dom0_op(&op) != 0) {
+ *base = 0;
+ *size = 0;
+ *type = 0;
+ return;
+ }
+
+ *size = op.u.read_memtype.nr_mfns;
+ *base = op.u.read_memtype.mfn;
+ *type = op.u.read_memtype.type;
+}
+
static int xen_get_free_region(unsigned long base, unsigned long size, int replace_reg)
{
struct xen_platform_op op;
@@ -40,12 +86,10 @@ static int xen_get_free_region(unsigned long base, unsigned long size, int repla
return replace_reg;
}

-/* DOM0 TODO: Need to fill in the remaining mtrr methods to have full
- * working userland mtrr support. */
static struct mtrr_ops xen_mtrr_ops = {
.vendor = X86_VENDOR_UNKNOWN,
-// .set = xen_set_mtrr,
-// .get = xen_get_mtrr,
+ .set = xen_set_mtrr,
+ .get = xen_get_mtrr,
.get_free_region = xen_get_free_region,
.validate_add_page = generic_validate_add_page,
.have_wrcomb = positive_have_wrcomb,
--
1.6.0.6

2009-03-13 17:07:46

by Jeremy Fitzhardinge

Subject: [PATCH 24/27] xen mtrr: Kill some unnecessary includes

From: Mark McLoughlin <[email protected]>

Signed-off-by: Mark McLoughlin <[email protected]>
---
arch/x86/kernel/cpu/mtrr/xen.c | 8 +-------
1 files changed, 1 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/mtrr/xen.c b/arch/x86/kernel/cpu/mtrr/xen.c
index 184066d..50a45db 100644
--- a/arch/x86/kernel/cpu/mtrr/xen.c
+++ b/arch/x86/kernel/cpu/mtrr/xen.c
@@ -1,12 +1,6 @@
#include <linux/init.h>
-#include <linux/proc_fs.h>
-#include <linux/ctype.h>
-#include <linux/module.h>
-#include <linux/seq_file.h>
-#include <asm/uaccess.h>
-#include <linux/mutex.h>
-
-#include <asm/mtrr.h>
+#include <linux/mm.h>
+
#include "mtrr.h"

#include <xen/interface/platform.h>
--
1.6.0.6

2009-03-13 17:09:08

by Jeremy Fitzhardinge

Subject: [PATCH 19/27] xen: set callout_map to make mtrr work

From: Jeremy Fitzhardinge <[email protected]>

The MTRR code needs the callout map for its cross-CPU setup handshake.

Signed-off-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/xen/smp.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 304d832..dc9cd85 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -210,6 +210,7 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
if (IS_ERR(idle))
panic("failed fork for CPU %d", cpu);

+ cpumask_set_cpu(cpu, cpu_callout_mask);
cpu_set(cpu, cpu_present_map);
}
}
--
1.6.0.6

2009-03-13 17:09:43

by Jeremy Fitzhardinge

Subject: [PATCH 07/27] Xen/x86/PCI: Add support for the Xen PCI subsystem

From: Alex Nixon <[email protected]>

On boot, the system will search to see if a Xen iommu/pci subsystem is
available. If the kernel detects it's running in a domain rather than
on bare hardware, this subsystem will be used. Otherwise, it falls
back to using hardware as usual.

The frontend stub lives in arch/x86/pci/xen.c, alongside other
sub-arch PCI init code (e.g. olpc.c).

(All subsequent fixes, API changes and swiotlb operations folded in.)

Signed-off-by: Alex Nixon <[email protected]>
Signed-off-by: Jeremy Fitzhardinge <[email protected]>
Signed-off-by: Ian Campbell <[email protected]>
---
arch/x86/Kconfig | 4 +
arch/x86/include/asm/pci_x86.h | 1 +
arch/x86/include/asm/xen/iommu.h | 12 ++
arch/x86/kernel/pci-dma.c | 3 +
arch/x86/pci/Makefile | 1 +
arch/x86/pci/init.c | 6 +
arch/x86/pci/xen.c | 52 +++++++
drivers/pci/Makefile | 2 +
drivers/pci/xen-iommu.c | 294 ++++++++++++++++++++++++++++++++++++++
9 files changed, 375 insertions(+), 0 deletions(-)
create mode 100644 arch/x86/include/asm/xen/iommu.h
create mode 100644 arch/x86/pci/xen.c
create mode 100644 drivers/pci/xen-iommu.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 15ec8a2..9092750 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1828,6 +1828,10 @@ config PCI_OLPC
def_bool y
depends on PCI && OLPC && (PCI_GOOLPC || PCI_GOANY)

+config PCI_XEN
+ def_bool y
+ depends on XEN_PCI_PASSTHROUGH || XEN_DOM0_PCI
+
config PCI_DOMAINS
def_bool y
depends on PCI
diff --git a/arch/x86/include/asm/pci_x86.h b/arch/x86/include/asm/pci_x86.h
index 5401ca2..34f03a4 100644
--- a/arch/x86/include/asm/pci_x86.h
+++ b/arch/x86/include/asm/pci_x86.h
@@ -107,6 +107,7 @@ extern int pci_direct_probe(void);
extern void pci_direct_init(int type);
extern void pci_pcbios_init(void);
extern int pci_olpc_init(void);
+extern int pci_xen_init(void);
extern void __init dmi_check_pciprobe(void);
extern void __init dmi_check_skip_isa_align(void);

diff --git a/arch/x86/include/asm/xen/iommu.h b/arch/x86/include/asm/xen/iommu.h
new file mode 100644
index 0000000..75df312
--- /dev/null
+++ b/arch/x86/include/asm/xen/iommu.h
@@ -0,0 +1,12 @@
+#ifndef ASM_X86__XEN_IOMMU_H
+
+#ifdef CONFIG_PCI_XEN
+extern void xen_iommu_init(void);
+#else
+static inline void xen_iommu_init(void)
+{
+}
+#endif
+
+#endif
+
diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index f293a8d..361fde2 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -9,6 +9,7 @@
#include <asm/gart.h>
#include <asm/calgary.h>
#include <asm/amd_iommu.h>
+#include <asm/xen/iommu.h>

static int forbid_dac __read_mostly;

@@ -265,6 +266,8 @@ EXPORT_SYMBOL(dma_supported);

static int __init pci_iommu_init(void)
{
+ xen_iommu_init();
+
calgary_iommu_init();

intel_iommu_init();
diff --git a/arch/x86/pci/Makefile b/arch/x86/pci/Makefile
index d49202e..64182c5 100644
--- a/arch/x86/pci/Makefile
+++ b/arch/x86/pci/Makefile
@@ -4,6 +4,7 @@ obj-$(CONFIG_PCI_BIOS) += pcbios.o
obj-$(CONFIG_PCI_MMCONFIG) += mmconfig_$(BITS).o direct.o mmconfig-shared.o
obj-$(CONFIG_PCI_DIRECT) += direct.o
obj-$(CONFIG_PCI_OLPC) += olpc.o
+obj-$(CONFIG_PCI_XEN) += xen.o

obj-y += fixup.o
obj-$(CONFIG_ACPI) += acpi.o
diff --git a/arch/x86/pci/init.c b/arch/x86/pci/init.c
index 25a1f8e..4e2f90a 100644
--- a/arch/x86/pci/init.c
+++ b/arch/x86/pci/init.c
@@ -15,10 +15,16 @@ static __init int pci_arch_init(void)
if (!(pci_probe & PCI_PROBE_NOEARLY))
pci_mmcfg_early_init();

+#ifdef CONFIG_PCI_XEN
+ if (!pci_xen_init())
+ return 0;
+#endif
+
#ifdef CONFIG_PCI_OLPC
if (!pci_olpc_init())
return 0; /* skip additional checks if it's an XO */
#endif
+
#ifdef CONFIG_PCI_BIOS
pci_pcbios_init();
#endif
diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
new file mode 100644
index 0000000..76f803f
--- /dev/null
+++ b/arch/x86/pci/xen.c
@@ -0,0 +1,52 @@
+/*
+ * Xen PCI Frontend Stub - puts some "dummy" functions into the Linux
+ * x86 PCI core to support the Xen PCI Frontend
+ *
+ * Author: Ryan Wilson <[email protected]>
+ */
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/pci.h>
+#include <linux/acpi.h>
+
+#include <asm/pci_x86.h>
+
+#include <asm/xen/hypervisor.h>
+
+static int xen_pcifront_enable_irq(struct pci_dev *dev)
+{
+ return 0;
+}
+
+extern int isapnp_disable;
+
+int __init pci_xen_init(void)
+{
+ if (!xen_pv_domain() || xen_initial_domain())
+ return -ENODEV;
+
+ printk(KERN_INFO "PCI: setting up Xen PCI frontend stub\n");
+
+ pcibios_set_cache_line_size();
+
+ pcibios_enable_irq = xen_pcifront_enable_irq;
+ pcibios_disable_irq = NULL;
+
+#ifdef CONFIG_ACPI
+ /* Keep ACPI out of the picture */
+ acpi_noirq = 1;
+#endif
+
+#ifdef CONFIG_ISAPNP
+ /* Stop isapnp from probing */
+ isapnp_disable = 1;
+#endif
+
+ /* Ensure a device still gets scanned even if its function
+ * number is non-zero.
+ */
+ pci_scan_all_fns = 1;
+
+ return 0;
+}
+
diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
index 3d07ce2..106404e 100644
--- a/drivers/pci/Makefile
+++ b/drivers/pci/Makefile
@@ -27,6 +27,8 @@ obj-$(CONFIG_HT_IRQ) += htirq.o
# Build Intel IOMMU support
obj-$(CONFIG_DMAR) += dmar.o iova.o intel-iommu.o

+# Build Xen IOMMU support
+obj-$(CONFIG_PCI_XEN) += xen-iommu.o
obj-$(CONFIG_INTR_REMAP) += dmar.o intr_remapping.o

#
diff --git a/drivers/pci/xen-iommu.c b/drivers/pci/xen-iommu.c
new file mode 100644
index 0000000..5b701e8
--- /dev/null
+++ b/drivers/pci/xen-iommu.c
@@ -0,0 +1,294 @@
+#include <linux/types.h>
+#include <linux/mm.h>
+#include <linux/string.h>
+#include <linux/pci.h>
+#include <linux/module.h>
+#include <linux/version.h>
+#include <linux/scatterlist.h>
+#include <linux/io.h>
+#include <linux/bug.h>
+
+#include <xen/interface/xen.h>
+#include <xen/grant_table.h>
+#include <xen/page.h>
+#include <xen/xen-ops.h>
+
+#include <asm/iommu.h>
+#include <asm/swiotlb.h>
+#include <asm/tlbflush.h>
+
+#define IOMMU_BUG_ON(test) \
+do { \
+ if (unlikely(test)) { \
+ printk(KERN_ALERT "Fatal DMA error! " \
+ "Please use 'swiotlb=force'\n"); \
+ BUG(); \
+ } \
+} while (0)
+
+/* Print address range with message */
+#define PAR(msg, addr, size) \
+do { \
+ printk(msg "[%#llx - %#llx]\n", \
+ (unsigned long long)addr, \
+ (unsigned long long)addr + size); \
+} while (0)
+
+struct dma_coherent_mem {
+ void *virt_base;
+ u32 device_base;
+ int size;
+ int flags;
+ unsigned long *bitmap;
+};
+
+static inline int address_needs_mapping(struct device *hwdev,
+ dma_addr_t addr)
+{
+ dma_addr_t mask = 0xffffffff;
+ int ret;
+
+ /* If the device has a mask, use it, otherwise default to 32 bits */
+ if (hwdev && hwdev->dma_mask)
+ mask = *hwdev->dma_mask;
+
+ ret = (addr & ~mask) != 0;
+
+ if (ret) {
+ printk(KERN_ERR "dma address needs mapping\n");
+ printk(KERN_ERR "mask: %#llx\n address: [%#llx]\n", mask, addr);
+ }
+ return ret;
+}
+
+static int check_pages_physically_contiguous(unsigned long pfn,
+ unsigned int offset,
+ size_t length)
+{
+ unsigned long next_mfn;
+ int i;
+ int nr_pages;
+
+ next_mfn = pfn_to_mfn(pfn);
+ nr_pages = (offset + length + PAGE_SIZE-1) >> PAGE_SHIFT;
+
+ for (i = 1; i < nr_pages; i++) {
+ if (pfn_to_mfn(++pfn) != ++next_mfn)
+ return 0;
+ }
+ return 1;
+}
+
+static int range_straddles_page_boundary(phys_addr_t p, size_t size)
+{
+ unsigned long pfn = PFN_DOWN(p);
+ unsigned int offset = p & ~PAGE_MASK;
+
+ if (offset + size <= PAGE_SIZE)
+ return 0;
+ if (check_pages_physically_contiguous(pfn, offset, size))
+ return 0;
+ return 1;
+}
+
+static inline void xen_dma_unmap_page(struct page *page)
+{
+ /* Xen TODO: 2.6.18 xen calls __gnttab_dma_unmap_page here
+ * to deal with foreign pages. We'll need similar logic here at
+ * some point.
+ */
+}
+
+/* Gets dma address of a page */
+static inline dma_addr_t xen_dma_map_page(struct page *page)
+{
+ /* Xen TODO: 2.6.18 xen calls __gnttab_dma_map_page here to deal
+ * with foreign pages. We'll need similar logic here at some
+ * point.
+ */
+ return ((dma_addr_t)pfn_to_mfn(page_to_pfn(page))) << PAGE_SHIFT;
+}
+
+static int xen_map_sg(struct device *hwdev, struct scatterlist *sg,
+ int nents,
+ enum dma_data_direction direction,
+ struct dma_attrs *attrs)
+{
+ struct scatterlist *s;
+ struct page *page;
+ int i, rc;
+
+ BUG_ON(direction == DMA_NONE);
+ WARN_ON(nents == 0 || sg[0].length == 0);
+
+ for_each_sg(sg, s, nents, i) {
+ BUG_ON(!sg_page(s));
+ page = sg_page(s);
+ s->dma_address = xen_dma_map_page(page) + s->offset;
+ s->dma_length = s->length;
+ IOMMU_BUG_ON(range_straddles_page_boundary(
+ page_to_phys(page), s->length));
+ }
+
+ rc = nents;
+
+ flush_write_buffers();
+ return rc;
+}
+
+static void xen_unmap_sg(struct device *hwdev, struct scatterlist *sg,
+ int nents,
+ enum dma_data_direction direction,
+ struct dma_attrs *attrs)
+{
+ struct scatterlist *s;
+ struct page *page;
+ int i;
+
+ for_each_sg(sg, s, nents, i) {
+ page = pfn_to_page(mfn_to_pfn(PFN_DOWN(s->dma_address)));
+ xen_dma_unmap_page(page);
+ }
+}
+
+static void *xen_alloc_coherent(struct device *dev, size_t size,
+ dma_addr_t *dma_handle, gfp_t gfp)
+{
+ void *ret;
+ struct dma_coherent_mem *mem = dev ? dev->dma_mem : NULL;
+ unsigned int order = get_order(size);
+ unsigned long vstart;
+ u64 mask;
+
+ /* ignore region specifiers */
+ gfp &= ~(__GFP_DMA | __GFP_HIGHMEM);
+
+ if (mem) {
+ int page = bitmap_find_free_region(mem->bitmap, mem->size,
+ order);
+ if (page >= 0) {
+ *dma_handle = mem->device_base + (page << PAGE_SHIFT);
+ ret = mem->virt_base + (page << PAGE_SHIFT);
+ memset(ret, 0, size);
+ return ret;
+ }
+ if (mem->flags & DMA_MEMORY_EXCLUSIVE)
+ return NULL;
+ }
+
+ if (dev == NULL || (dev->coherent_dma_mask < 0xffffffff))
+ gfp |= GFP_DMA;
+
+ vstart = __get_free_pages(gfp, order);
+ ret = (void *)vstart;
+
+ if (dev != NULL && dev->coherent_dma_mask)
+ mask = dev->coherent_dma_mask;
+ else
+ mask = 0xffffffff;
+
+ if (ret != NULL) {
+ if (xen_create_contiguous_region(vstart, order,
+ fls64(mask)) != 0) {
+ free_pages(vstart, order);
+ return NULL;
+ }
+ memset(ret, 0, size);
+ *dma_handle = virt_to_machine(ret).maddr;
+ }
+ return ret;
+}
+
+static void xen_free_coherent(struct device *dev, size_t size,
+ void *vaddr, dma_addr_t dma_addr)
+{
+ struct dma_coherent_mem *mem = dev ? dev->dma_mem : NULL;
+ int order = get_order(size);
+
+ if (mem && vaddr >= mem->virt_base &&
+ vaddr < (mem->virt_base + (mem->size << PAGE_SHIFT))) {
+ int page = (vaddr - mem->virt_base) >> PAGE_SHIFT;
+ bitmap_release_region(mem->bitmap, page, order);
+ } else {
+ xen_destroy_contiguous_region((unsigned long)vaddr, order);
+ free_pages((unsigned long)vaddr, order);
+ }
+}
+
+static dma_addr_t xen_map_page(struct device *dev, struct page *page,
+ unsigned long offset, size_t size,
+ enum dma_data_direction direction,
+ struct dma_attrs *attrs)
+{
+ dma_addr_t dma;
+
+ BUG_ON(direction == DMA_NONE);
+
+ WARN_ON(size == 0);
+
+ dma = xen_dma_map_page(page) + offset;
+
+ IOMMU_BUG_ON(address_needs_mapping(dev, dma));
+ flush_write_buffers();
+ return dma;
+}
+
+static void xen_unmap_page(struct device *dev, dma_addr_t dma_addr,
+ size_t size,
+ enum dma_data_direction direction,
+ struct dma_attrs *attrs)
+{
+ BUG_ON(direction == DMA_NONE);
+ xen_dma_unmap_page(pfn_to_page(mfn_to_pfn(PFN_DOWN(dma_addr))));
+}
+
+static struct dma_map_ops xen_dma_ops = {
+ .dma_supported = NULL,
+
+ .alloc_coherent = xen_alloc_coherent,
+ .free_coherent = xen_free_coherent,
+
+ .map_page = xen_map_page,
+ .unmap_page = xen_unmap_page,
+
+ .map_sg = xen_map_sg,
+ .unmap_sg = xen_unmap_sg,
+
+ .mapping_error = NULL,
+
+ .is_phys = 0,
+};
+
+static struct dma_map_ops xen_swiotlb_dma_ops = {
+ .dma_supported = swiotlb_dma_supported,
+
+ .alloc_coherent = xen_alloc_coherent,
+ .free_coherent = xen_free_coherent,
+
+ .map_page = swiotlb_map_page,
+ .unmap_page = swiotlb_unmap_page,
+
+ .map_sg = swiotlb_map_sg_attrs,
+ .unmap_sg = swiotlb_unmap_sg_attrs,
+
+ .mapping_error = swiotlb_dma_mapping_error,
+
+ .is_phys = 0,
+};
+
+void __init xen_iommu_init(void)
+{
+ if (!xen_pv_domain())
+ return;
+
+ printk(KERN_INFO "Xen: Initializing Xen DMA ops\n");
+
+ force_iommu = 0;
+ dma_ops = &xen_dma_ops;
+
+ if (swiotlb) {
+ printk(KERN_INFO "Xen: Enabling DMA fallback to swiotlb\n");
+ dma_ops = &xen_swiotlb_dma_ops;
+ }
+}
+
--
1.6.0.6

2009-03-13 17:09:28

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: [PATCH 18/27] xen/swiotlb: add sync functions

From: Jeremy Fitzhardinge <[email protected]>

Add all the missing sync functions. This fixes iwlagn.
(Need to think about what to do with non-swiotlb mode.)
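For the non-swiotlb case, one plausible stopgap (a sketch only, not part
of this patch; the hook names below are made up) would be sync functions
that do no cache maintenance at all, on the assumption that x86 DMA is
cache coherent, and only preserve write ordering the way xen_map_page()
already does:

static void xen_sync_single_for_cpu(struct device *hwdev,
				    dma_addr_t dma_handle, size_t size,
				    enum dma_data_direction dir)
{
	/* coherent DMA and no bounce buffers: device writes are
	 * already visible to the CPU, so nothing to do */
}

static void xen_sync_single_for_device(struct device *hwdev,
				       dma_addr_t dma_handle, size_t size,
				       enum dma_data_direction dir)
{
	/* order outstanding CPU writes before the device looks */
	flush_write_buffers();
}

Leaving the hooks NULL in xen_dma_ops should behave much the same,
since the generic wrappers skip them when they are unset.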

Signed-off-by: Jeremy Fitzhardinge <[email protected]>
---
drivers/pci/xen-iommu.c | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/xen-iommu.c b/drivers/pci/xen-iommu.c
index 13cfe0c..afcebf9 100644
--- a/drivers/pci/xen-iommu.c
+++ b/drivers/pci/xen-iommu.c
@@ -303,6 +303,13 @@ static struct dma_map_ops xen_swiotlb_dma_ops = {

.mapping_error = swiotlb_dma_mapping_error,

+ .sync_single_for_cpu = swiotlb_sync_single_for_cpu,
+ .sync_single_for_device = swiotlb_sync_single_for_device,
+ .sync_single_range_for_cpu = swiotlb_sync_single_range_for_cpu,
+ .sync_single_range_for_device = swiotlb_sync_single_range_for_device,
+ .sync_sg_for_cpu = swiotlb_sync_sg_for_cpu,
+ .sync_sg_for_device = swiotlb_sync_sg_for_device,
+
.is_phys = 0,
};

--
1.6.0.6

2009-03-13 17:10:01

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: [PATCH 16/27] swiotlb: use swiotlb_alloc_boot to allocate emergency pool

Also fix xen_swiotlb_fixup() to deal with regions smaller than a full
IO_TLB_SEGSIZE segment, such as the emergency pool, which the old
segment-at-a-time loop would otherwise overrun.
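To put numbers on that (illustrative arithmetic only, assuming the
usual defaults of IO_TLB_SHIFT = 11 and IO_TLB_SEGSIZE = 128; a given
config may differ):

/* One swiotlb segment is 128 slabs of 2KB = 256KB, while the 32KB
 * emergency pool is only 16 slabs, i.e. well under one segment. */
#include <stdio.h>

int main(void)
{
	unsigned long io_tlb_shift = 11;		/* 2KB slabs */
	unsigned long io_tlb_segsize = 128;		/* slabs per segment */
	unsigned long io_tlb_overflow = 32 * 1024;	/* bytes */

	printf("segment: %lu KB\n",
	       (io_tlb_segsize << io_tlb_shift) / 1024);	/* 256 */
	printf("overflow pool: %lu slabs\n",
	       io_tlb_overflow >> io_tlb_shift);		/* 16 */
	return 0;
}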

Signed-off-by: Jeremy Fitzhardinge <[email protected]>
---
drivers/pci/xen-iommu.c | 12 +++++++++---
lib/swiotlb.c | 3 ++-
2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/xen-iommu.c b/drivers/pci/xen-iommu.c
index 608b8e2..7b6fd3e 100644
--- a/drivers/pci/xen-iommu.c
+++ b/drivers/pci/xen-iommu.c
@@ -47,16 +47,22 @@ void xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
buf, size);

dma_bits = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT) + PAGE_SHIFT;
- for (i = 0; i < nslabs; i += IO_TLB_SEGSIZE) {
+
+ i = 0;
+ do {
+ int slabs = min(nslabs - i, (unsigned long)IO_TLB_SEGSIZE);
+
do {
rc = xen_create_contiguous_region(
(unsigned long)buf + (i << IO_TLB_SHIFT),
- get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT),
+ get_order(slabs << IO_TLB_SHIFT),
dma_bits);
} while (rc && dma_bits++ < max_dma_bits);
if (rc)
panic(KERN_ERR "xen_create_contiguous_region failed\n");
- }
+
+ i += slabs;
+ } while (i < nslabs);
}

int xen_wants_swiotlb(void)
diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 32e2bd3..8e6f6c8 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -202,7 +202,8 @@ swiotlb_init_with_default_size(size_t default_size)
/*
* Get the overflow emergency buffer
*/
- io_tlb_overflow_buffer = alloc_bootmem_low(io_tlb_overflow);
+ io_tlb_overflow_buffer = swiotlb_alloc_boot(io_tlb_overflow,
+ io_tlb_overflow >> IO_TLB_SHIFT);
if (!io_tlb_overflow_buffer)
panic("Cannot allocate SWIOTLB overflow buffer!\n");

--
1.6.0.6

2009-03-13 17:10:34

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: [PATCH 27/27] agp/intel: use dma_alloc_coherent for special cursor memory

From: Jeremy Fitzhardinge <[email protected]>

Given that i810 wants special physically contiguous memory for its cursor,
allocate it with dma_alloc_coherent, which will give us memory with the
right properties. This matters particularly under Xen, which won't normally
give us physically contiguous memory.
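For reference, the reason this ends up machine-contiguous under Xen is
the DMA ops introduced earlier in this series; roughly (a sketch of the
call path through the posted xen-iommu.c, not new code):

/*
 * dma_alloc_coherent(dev, size, &handle, gfp)
 *   -> dma_ops->alloc_coherent == xen_alloc_coherent()
 *        -> __get_free_pages()              pseudo-physical pages
 *        -> xen_create_contiguous_region()  exchange for machine-
 *                                           contiguous frames below
 *                                           the coherent DMA mask
 *        -> *dma_handle = virt_to_machine(ret).maddr
 */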

Signed-off-by: Jeremy Fitzhardinge <[email protected]>
Cc: David Airlie <[email protected]>
---
drivers/char/agp/intel-agp.c | 26 +++++++++++---------------
1 files changed, 11 insertions(+), 15 deletions(-)

diff --git a/drivers/char/agp/intel-agp.c b/drivers/char/agp/intel-agp.c
index 4373adb..8ba6808 100644
--- a/drivers/char/agp/intel-agp.c
+++ b/drivers/char/agp/intel-agp.c
@@ -244,33 +244,29 @@ static void intel_i810_agp_enable(struct agp_bridge_data *bridge, u32 mode)
/* Exists to support ARGB cursors */
static void *i8xx_alloc_pages(void)
{
- struct page *page;
-
- page = alloc_pages(GFP_KERNEL | GFP_DMA32, 2);
- if (page == NULL)
+ void *addr;
+ dma_addr_t _d;
+
+ addr = dma_alloc_coherent(NULL, 4 * PAGE_SIZE, &_d, GFP_KERNEL);
+ if (addr == NULL)
return NULL;

- if (set_pages_uc(page, 4) < 0) {
- set_pages_wb(page, 4);
- __free_pages(page, 2);
+ if (set_memory_uc((unsigned long)addr, 4) < 0) {
+ set_memory_wb((unsigned long)addr, 4);
+ dma_free_coherent(NULL, 4 * PAGE_SIZE, addr, _d);
return NULL;
}
- get_page(page);
atomic_inc(&agp_bridge->current_memory_agp);
- return page_address(page);
+ return addr;
}

static void i8xx_destroy_pages(void *addr)
{
- struct page *page;
-
if (addr == NULL)
return;

- page = virt_to_page(addr);
- set_pages_wb(page, 4);
- put_page(page);
- __free_pages(page, 2);
+ set_memory_wb((unsigned long)addr, 4);
+ dma_free_coherent(NULL, 4 * PAGE_SIZE, addr, virt_to_bus(addr));
atomic_dec(&agp_bridge->current_memory_agp);
}

--
1.6.0.6

2009-03-16 13:29:49

by Jan Beulich

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH 10/27] xen: make sure swiotlb allocation is physically contiguous

>>> Jeremy Fitzhardinge <[email protected]> 13.03.09 17:59 >>>
> void * __init swiotlb_alloc_boot(size_t size, unsigned long nslabs)
> {
>- return alloc_bootmem_low_pages(size);
>+ void *ret = alloc_bootmem_low_pages(size);
>+
>+ if (ret && xen_pv_domain())
>+ xen_swiotlb_fixup(ret, size, nslabs);
>+
>+ return ret;
> }

While on native using alloc_bootmem_low_pages() is a requirement here,
on Xen this should explicitly not be used, as we realized just a couple of
days ago: The way the bootmem allocator works, running out of space
below 4Gb is pretty easy on machines with lots of memory, and since the
swiotlb is a requirement for Dom0, the risk of allocation failures must be
kept as low as possible.

Jan

2009-03-16 19:37:55

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH 10/27] xen: make sure swiotlb allocation is physically contiguous

Jan Beulich wrote:
>
> While on native using alloc_bootmem_low_pages() is a requirement here,
> on Xen this should explicitly not be used, as we realized just a couple of
> days ago:

Which conversation was this?

> The way the bootmem allocator works, running out of space
> below 4Gb is pretty easy on machines with lots of memory, and since the
> swiotlb is a requirement for Dom0, the risk of allocation failures must be
> kept as low as possible.
>

Are we talking about a 32 or 64 bit dom0 here? If it's 32-bit, then yes,
low memory is a precious resource, but I don't see why that depends on
the total amount of memory. And if it's 64-bit, why is below-4G
particularly constrained? That would only matter for 32-bit devices?

J

2009-03-17 07:49:29

by Jan Beulich

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH 10/27] xen: make sure swiotlb allocation is physically contiguous

>>> Jeremy Fitzhardinge <[email protected]> 16.03.09 20:37 >>>
>Jan Beulich wrote:
>>
>> While on native using alloc_bootmem_low_pages() is a requirement here,
>> on Xen this should explicitly not be used, as we realized just a couple of
>> days ago:
>
>Which conversation was this?

I pointed out the issue in a mail titled "alloc_bootmem_low() mis-uses", in
response to a bug report we had from HP running into an out-of-memory
panic from the bootmem allocator when allocating the 64Mb of swiotlb on
a 256Gb box.

>> The way the bootmem allocator works, running out of space
>> below 4Gb is pretty easy on machines with lots of memory, and since the
>> swiotlb is a requirement for Dom0, the risk of allocation failures must be
>> kept as low as possible.
>>
>
>Are we talking about a 32 or 64 bit dom0 here? If it's 32-bit, then yes,
>low memory is a precious resource, but I don't see why that depends on
>the total amount of memory. And if it's 64-bit, why is below-4G
>particularly constrained? That would only matter for 32-bit devices?

No, it's in particular about 64-bit (and native has a similar potential problem
here, just that it's not that easily fixable because, as said,
alloc_bootmem_low_pages() is a must there): Since the bootmem allocator
(generally) works from bottom to top, 'normal' (i.e. without the _low suffix)
bootmem allocations may eat up most/all memory below 4Gb, and when
finally the swiotlb initialization runs there's no memory left for it.
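A toy model of that failure mode (illustrative only; the sizes and the
bottom-up first-fit policy here are made up, not the real bootmem code):

#include <stdio.h>

#define GB (1024ULL * 1024 * 1024)

static unsigned long long next_free;	/* bottom-up watermark */

/* first-fit from the bottom; fails if the region below 'limit' is full */
static long long toy_alloc(unsigned long long size, unsigned long long limit)
{
	if (next_free + size > limit)
		return -1;
	next_free += size;
	return next_free - size;
}

int main(void)
{
	unsigned long long mem = 256 * GB;

	/* 'normal' allocations may live anywhere, but land lowest-first
	 * and so eat up the region below 4Gb */
	while (toy_alloc(1 * GB, mem) >= 0 && next_free < 4 * GB)
		;

	/* by the time the 64Mb swiotlb asks for memory below 4Gb,
	 * nothing is left */
	if (toy_alloc(64 * 1024 * 1024, 4 * GB) < 0)
		printf("swiotlb allocation below 4Gb failed\n");
	return 0;
}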

For boxes with yet much more memory (as I realized only after writing
that other mail), there's also the issue of the bootmem allocation
bitmap itself generally being allocated as low as possible, which again
may eat up all memory below 4Gb. Fixing that at least doesn't require
touching the bootmem code itself, and it is of course a secondary
issue, as it will be a couple of years before this bitmap reaches into
the Gb range.

Jan

2009-03-30 21:13:00

by Dave Airlie

[permalink] [raw]
Subject: Re: [PATCH 27/27] agp/intel: use dma_alloc_coherent for special cursor memory



> From: Jeremy Fitzhardinge <[email protected]>
>
> Given that i810 wants special physically contiguous memory for its cursor,
> allocate it with dma_alloc_coherent, which will give us memory with the
> right properties. This matters particularly under Xen, which won't normally
> give us physically contiguous memory.
>
> Signed-off-by: Jeremy Fitzhardinge <[email protected]>
> Cc: David Airlie <[email protected]>

Acked-by: David Airlie <[email protected]>

> ---
> drivers/char/agp/intel-agp.c | 26 +++++++++++---------------
> 1 files changed, 11 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/char/agp/intel-agp.c b/drivers/char/agp/intel-agp.c
> index 4373adb..8ba6808 100644
> --- a/drivers/char/agp/intel-agp.c
> +++ b/drivers/char/agp/intel-agp.c
> @@ -244,33 +244,29 @@ static void intel_i810_agp_enable(struct agp_bridge_data *bridge, u32 mode)
> /* Exists to support ARGB cursors */
> static void *i8xx_alloc_pages(void)
> {
> - struct page *page;
> -
> - page = alloc_pages(GFP_KERNEL | GFP_DMA32, 2);
> - if (page == NULL)
> + void *addr;
> + dma_addr_t _d;
> +
> + addr = dma_alloc_coherent(NULL, 4 * PAGE_SIZE, &_d, GFP_KERNEL);
> + if (addr == NULL)
> return NULL;
>
> - if (set_pages_uc(page, 4) < 0) {
> - set_pages_wb(page, 4);
> - __free_pages(page, 2);
> + if (set_memory_uc((unsigned long)addr, 4) < 0) {
> + set_memory_wb((unsigned long)addr, 4);
> + dma_free_coherent(NULL, 4 * PAGE_SIZE, addr, _d);
> return NULL;
> }
> - get_page(page);
> atomic_inc(&agp_bridge->current_memory_agp);
> - return page_address(page);
> + return addr;
> }
>
> static void i8xx_destroy_pages(void *addr)
> {
> - struct page *page;
> -
> if (addr == NULL)
> return;
>
> - page = virt_to_page(addr);
> - set_pages_wb(page, 4);
> - put_page(page);
> - __free_pages(page, 2);
> + set_memory_wb((unsigned long)addr, 4);
> + dma_free_coherent(NULL, 4 * PAGE_SIZE, addr, virt_to_bus(addr));
> atomic_dec(&agp_bridge->current_memory_agp);
> }
>
>

2009-03-30 21:13:43

by Dave Airlie

[permalink] [raw]
Subject: Re: [PATCH 26/27] agp: use more dma-ops-like operations for agp memory

> From: Jeremy Fitzhardinge <[email protected]>
>
> When using AGP under Xen, we need to be careful to
> 1) properly translate between physical and machine addresses, and
> 2) make sure memory is physically contigious when the hardware expects it
>
> This change uses swiotlb_phys_to_bus/bus_to_phys to do the phys<->gart
> conversion, since they already do the right thing, and dma_alloc_coherent
> for gatt allocations. This should work equally well running native.
>
> Signed-off-by: Jeremy Fitzhardinge <[email protected]>
> Cc: David Airlie <[email protected]>

Acked-by: David Airlie <[email protected]>

> ---
> arch/x86/include/asm/agp.h | 15 ++++++++++-----
> lib/swiotlb.c | 2 ++
> 2 files changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/include/asm/agp.h b/arch/x86/include/asm/agp.h
> index 9825cd6..7ba2639 100644
> --- a/arch/x86/include/asm/agp.h
> +++ b/arch/x86/include/asm/agp.h
> @@ -1,8 +1,11 @@
> #ifndef _ASM_X86_AGP_H
> #define _ASM_X86_AGP_H
>
> +#include <linux/swiotlb.h>
> +
> #include <asm/pgtable.h>
> #include <asm/cacheflush.h>
> +#include <asm/dma-mapping.h>
>
> /*
> * Functions to keep the agpgart mappings coherent with the MMU. The
> @@ -23,13 +26,15 @@
> #define flush_agp_cache() wbinvd()
>
> /* Convert a physical address to an address suitable for the GART. */
> -#define phys_to_gart(x) (x)
> -#define gart_to_phys(x) (x)
> +#define phys_to_gart(x) swiotlb_phys_to_bus(NULL, (x))
> +#define gart_to_phys(x) swiotlb_bus_to_phys(x)
>
> /* GATT allocation. Returns/accepts GATT kernel virtual address. */
> -#define alloc_gatt_pages(order) \
> - ((char *)__get_free_pages(GFP_KERNEL, (order)))
> +#define alloc_gatt_pages(order) ({ \
> + char *_t; dma_addr_t _d; \
> + _t = dma_alloc_coherent(NULL,PAGE_SIZE<<(order),&_d,GFP_KERNEL); \
> + _t; })
> #define free_gatt_pages(table, order) \
> - free_pages((unsigned long)(table), (order))
> + dma_free_coherent(NULL,PAGE_SIZE<<(order),(table),virt_to_bus(table))
>
> #endif /* _ASM_X86_AGP_H */
> diff --git a/lib/swiotlb.c b/lib/swiotlb.c
> index 8e6f6c8..98fb7d3 100644
> --- a/lib/swiotlb.c
> +++ b/lib/swiotlb.c
> @@ -128,11 +128,13 @@ dma_addr_t __weak swiotlb_phys_to_bus(struct device *hwdev, phys_addr_t paddr)
> {
> return paddr;
> }
> +EXPORT_SYMBOL_GPL(swiotlb_phys_to_bus);
>
> phys_addr_t __weak swiotlb_bus_to_phys(dma_addr_t baddr)
> {
> return baddr;
> }
> +EXPORT_SYMBOL_GPL(swiotlb_bus_to_phys);
>
> static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev,
> volatile void *address)
>