2013-05-13 23:58:57

by Andy Lutomirski

[permalink] [raw]
Subject: [PATCH v3 0/9] Clean up write-combining MTRR addition

A fair number of drivers (mostly graphics) add write-combining MTRRs.
Most ignore errors and most add the MTRR even on PAT systems which don't
need to use MTRRs.

This series adds new functions arch_phys_wc_{add,del} that, on PAT-less
x86 systems with MTRRs, add MTRRs and report errors, and that do nothing
otherwise. (Other architectures, if any, with a similar mechanism could
implement them.)

I've only tested the radeon driver, since I don't have test hardware
easily available for the other drivers.

Benefits include:
- Simpler code
- No more complaints about MTRR conflict warnings on PAT systems
- Eventual unexporting of the MTRR API?

This series eliminates about half of the mtrr_add calls in drivers/.

Note: this series breaks and then fixes dritests from libdrm. The breakage
is probably irrelevant for any practical purpose, and fixing it will
be a bit complicated due to header file breakage that's only fixed in
patch 8.

Daniel, can you check that patch 1 is still okay?

Changes from v2:
- There's a new API phys_wc_to_mtrr_index (x86-only) to support drmGetMap
- Minor cleanups to patche 1 and 5.
- Patch 9 is new

Changes from v1:
- Helpers renamed
- Lots of bugs fixed

Andy Lutomirski (9):
Add arch_phys_wc_{add,del} to manipulate WC MTRRs if needed
drm (ast,cirrus,mgag200,nouveau,savage,vmwgfx): Remove
drm_mtrr_{add,del}
drm: Update drm_addmap and drm_mmap to use PAT WC instead of MTRRs
drm,agpgart: Use pgprot_writecombine for AGP maps and make the MTRR
optional
i915: Use arch_phys_wc_{add,del}
radeon: Switch to arch_phys_wc_add and add a missing ..._del
uvesafb: Clean up MTRR code
drm: Remove mtrr_add and mtrr_del fallback hack for non-MTRR systems
drm: Don't leak phys_wc "handles" to userspace

Andy Lutomirski (9):
Add arch_phys_wc_{add,del} to manipulate WC MTRRs if needed
drm (ast,cirrus,mgag200,nouveau,savage,vmwgfx): Remove
drm_mtrr_{add,del}
drm: Update drm_addmap and drm_mmap to use PAT WC instead of MTRRs
drm,agpgart: Use pgprot_writecombine for AGP maps and make the MTRR
optional
i915: Use arch_phys_wc_{add,del}
radeon: Switch to arch_phys_wc_add and add a missing ..._del
uvesafb: Clean up MTRR code
drm: Remove mtrr_add and mtrr_del fallback hack for non-MTRR systems
drm: Don't leak phys_wc "handles" to userspace

Documentation/fb/uvesafb.txt | 16 +++-----
arch/x86/include/asm/io.h | 7 ++++
arch/x86/include/asm/mtrr.h | 10 ++++-
arch/x86/kernel/cpu/mtrr/main.c | 71 ++++++++++++++++++++++++++++++++++
drivers/char/agp/frontend.c | 8 ++--
drivers/gpu/drm/ast/ast_ttm.c | 13 ++-----
drivers/gpu/drm/cirrus/cirrus_ttm.c | 15 ++-----
drivers/gpu/drm/drm_bufs.c | 26 +++++++++----
drivers/gpu/drm/drm_ioctl.c | 15 ++++++-
drivers/gpu/drm/drm_pci.c | 8 ++--
drivers/gpu/drm/drm_stub.c | 10 +----
drivers/gpu/drm/drm_vm.c | 22 +++++------
drivers/gpu/drm/i915/i915_dma.c | 42 ++------------------
drivers/gpu/drm/mgag200/mgag200_ttm.c | 14 ++-----
drivers/gpu/drm/nouveau/nouveau_ttm.c | 13 ++-----
drivers/gpu/drm/radeon/radeon_object.c | 5 ++-
drivers/gpu/drm/savage/savage_bci.c | 43 +++++++-------------
drivers/gpu/drm/savage/savage_drv.h | 5 +--
drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 10 ++---
drivers/video/uvesafb.c | 70 ++++++++-------------------------
include/drm/drmP.h | 34 +---------------
include/drm/drm_os_linux.h | 16 --------
include/linux/io.h | 25 ++++++++++++
include/video/uvesafb.h | 1 +
24 files changed, 230 insertions(+), 269 deletions(-)

--
1.8.1.4


2013-05-13 23:59:03

by Andy Lutomirski

[permalink] [raw]
Subject: [PATCH v3 1/9] Add arch_phys_wc_{add,del} to manipulate WC MTRRs if needed

Several drivers currently use mtrr_add through various #ifdef guards
and/or drm wrappers. The vast majority of them want to add WC MTRRs
on x86 systems and don't actually need the MTRR if PAT (i.e.
ioremap_wc, etc) are working.

arch_phys_wc_add and arch_phys_wc_del are new functions, available
on all architectures and configurations, that add WC MTRRs on x86 if
needed (and handle errors) and do nothing at all otherwise. They're
also easier to use than mtrr_add and mtrr_del, so the call sites can
be simplified.

As an added benefit, this will avoid wasting MTRRs and possibly
warning pointlessly on PAT-supporting systems.

Reviewed-by: Daniel Vetter <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---

Changes from v2:
- Make MTRR_TO_PHYS_WC_OFFSET a macro instead of open-coding it.
- Add phys_wc_to_mtrr_index to make drmGetMap keep working. Sigh.

arch/x86/include/asm/io.h | 7 ++++
arch/x86/include/asm/mtrr.h | 10 +++++-
arch/x86/kernel/cpu/mtrr/main.c | 71 +++++++++++++++++++++++++++++++++++++++++
include/linux/io.h | 25 +++++++++++++++
4 files changed, 112 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index d8e8eef..34f69cb 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -345,4 +345,11 @@ extern bool xen_biovec_phys_mergeable(const struct bio_vec *vec1,

#define IO_SPACE_LIMIT 0xffff

+#ifdef CONFIG_MTRR
+extern int __must_check arch_phys_wc_add(unsigned long base,
+ unsigned long size);
+extern void arch_phys_wc_del(int handle);
+#define arch_phys_wc_add arch_phys_wc_add
+#endif
+
#endif /* _ASM_X86_IO_H */
diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index e235582..f768f62 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -26,7 +26,10 @@
#include <uapi/asm/mtrr.h>


-/* The following functions are for use by other drivers */
+/*
+ * The following functions are for use by other drivers that cannot use
+ * arch_phys_wc_add and arch_phys_wc_del.
+ */
# ifdef CONFIG_MTRR
extern u8 mtrr_type_lookup(u64 addr, u64 end);
extern void mtrr_save_fixed_ranges(void *);
@@ -45,6 +48,7 @@ extern void mtrr_aps_init(void);
extern void mtrr_bp_restore(void);
extern int mtrr_trim_uncached_memory(unsigned long end_pfn);
extern int amd_special_default_mtrr(void);
+extern int phys_wc_to_mtrr_index(int handle);
# else
static inline u8 mtrr_type_lookup(u64 addr, u64 end)
{
@@ -80,6 +84,10 @@ static inline int mtrr_trim_uncached_memory(unsigned long end_pfn)
static inline void mtrr_centaur_report_mcr(int mcr, u32 lo, u32 hi)
{
}
+static inline int phys_wc_to_mtrr_index(int handle)
+{
+ return -1;
+}

#define mtrr_ap_init() do {} while (0)
#define mtrr_bp_init() do {} while (0)
diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
index 726bf96..3533d4d 100644
--- a/arch/x86/kernel/cpu/mtrr/main.c
+++ b/arch/x86/kernel/cpu/mtrr/main.c
@@ -51,9 +51,13 @@
#include <asm/e820.h>
#include <asm/mtrr.h>
#include <asm/msr.h>
+#include <asm/pat.h>

#include "mtrr.h"

+/* arch_phys_wc_add returns an MTRR register index plus this offset. */
+#define MTRR_TO_PHYS_WC_OFFSET 1000
+
u32 num_var_ranges;

unsigned int mtrr_usage_table[MTRR_MAX_VAR_RANGES];
@@ -524,6 +528,73 @@ int mtrr_del(int reg, unsigned long base, unsigned long size)
}
EXPORT_SYMBOL(mtrr_del);

+/**
+ * arch_phys_wc_add - add a WC MTRR and handle errors if PAT is unavailable
+ * @base: Physical base address
+ * @size: Size of region
+ *
+ * If PAT is available, this does nothing. If PAT is unavailable, it
+ * attempts to add a WC MTRR covering size bytes starting at base and
+ * logs an error if this fails.
+ *
+ * Drivers must store the return value to pass to mtrr_del_wc_if_needed,
+ * but drivers should not try to interpret that return value.
+ */
+int arch_phys_wc_add(unsigned long base, unsigned long size)
+{
+ int ret;
+
+ if (pat_enabled)
+ return 0; /* Success! (We don't need to do anything.) */
+
+ ret = mtrr_add(base, size, MTRR_TYPE_WRCOMB, true);
+ if (ret < 0) {
+ pr_warn("Failed to add WC MTRR for [%p-%p]; performance may suffer.",
+ (void *)base, (void *)(base + size - 1));
+ return ret;
+ }
+ return ret + MTRR_TO_PHYS_WC_OFFSET;
+}
+EXPORT_SYMBOL(arch_phys_wc_add);
+
+/*
+ * arch_phys_wc_del - undoes arch_phys_wc_add
+ * @handle: Return value from arch_phys_wc_add
+ *
+ * This cleans up after mtrr_add_wc_if_needed.
+ *
+ * The API guarantees that mtrr_del_wc_if_needed(error code) and
+ * mtrr_del_wc_if_needed(0) do nothing.
+ */
+void arch_phys_wc_del(int handle)
+{
+ if (handle >= 1) {
+ WARN_ON(handle < MTRR_TO_PHYS_WC_OFFSET);
+ mtrr_del(handle - MTRR_TO_PHYS_WC_OFFSET, 0, 0);
+ }
+}
+EXPORT_SYMBOL(arch_phys_wc_del);
+
+/*
+ * phys_wc_to_mtrr_index - translates arch_phys_wc_add's return value
+ * @handle: Return value from arch_phys_wc_add
+ *
+ * This will turn the return value from arch_phys_wc_add into an mtrr
+ * index suitable for debugging.
+ *
+ * Note: There is no legitimate use for this function, except possibly
+ * in printk line. Alas there is an illegitimate use in some ancient
+ * drm ioctls.
+ */
+int phys_wc_to_mtrr_index(int handle)
+{
+ if (handle < MTRR_TO_PHYS_WC_OFFSET)
+ return -1;
+ else
+ return handle - MTRR_TO_PHYS_WC_OFFSET;
+}
+EXPORT_SYMBOL_GPL(phys_wc_to_mtrr_index);
+
/*
* HACK ALERT!
* These should be called implicitly, but we can't yet until all the initcall
diff --git a/include/linux/io.h b/include/linux/io.h
index 069e407..f4f42fa 100644
--- a/include/linux/io.h
+++ b/include/linux/io.h
@@ -76,4 +76,29 @@ void devm_ioremap_release(struct device *dev, void *res);
#define arch_has_dev_port() (1)
#endif

+/*
+ * Some systems (x86 without PAT) have a somewhat reliable way to mark a
+ * physical address range such that uncached mappings will actually
+ * end up write-combining. This facility should be used in conjunction
+ * with pgprot_writecombine, ioremap-wc, or set_memory_wc, since it has
+ * no effect if the per-page mechanisms are functional.
+ * (On x86 without PAT, these functions manipulate MTRRs.)
+ *
+ * arch_phys_del_wc(0) or arch_phys_del_wc(any error code) is guaranteed
+ * to have no effect.
+ */
+#ifndef arch_phys_wc_add
+static inline int __must_check arch_phys_wc_add(unsigned long base,
+ unsigned long size)
+{
+ return 0; /* It worked (i.e. did nothing). */
+}
+
+static inline void arch_phys_wc_del(int handle)
+{
+}
+
+#define arch_phys_wc_add arch_phys_wc_add
+#endif
+
#endif /* _LINUX_IO_H */
--
1.8.1.4

2013-05-13 23:59:23

by Andy Lutomirski

[permalink] [raw]
Subject: [PATCH v3 7/9] uvesafb: Clean up MTRR code

The old code allowed very strange memory types. Now it works like
all the other video drivers: ioremap_wc is used unconditionally,
and MTRRs are set if PAT is unavailable (unless MTRR is disabled
by a module parameter).

UC, WB, and WT support is gone. If there are MTRR conflicts that prevent
addition of a WC MTRR, adding a non-conflicting MTRR is pointless; it's
better to just turn off MTRR support entirely.

As an added bonus, any MTRR added is freed on unload.

Reviewed-by: Daniel Vetter <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---
Documentation/fb/uvesafb.txt | 16 ++++------
drivers/video/uvesafb.c | 70 +++++++++++---------------------------------
include/video/uvesafb.h | 1 +
3 files changed, 23 insertions(+), 64 deletions(-)

diff --git a/Documentation/fb/uvesafb.txt b/Documentation/fb/uvesafb.txt
index eefdd91..f6362d8 100644
--- a/Documentation/fb/uvesafb.txt
+++ b/Documentation/fb/uvesafb.txt
@@ -81,17 +81,11 @@ pmipal Use the protected mode interface for palette changes.

mtrr:n Setup memory type range registers for the framebuffer
where n:
- 0 - disabled (equivalent to nomtrr) (default)
- 1 - uncachable
- 2 - write-back
- 3 - write-combining
- 4 - write-through
-
- If you see the following in dmesg, choose the type that matches
- the old one. In this example, use "mtrr:2".
-...
-mtrr: type mismatch for e0000000,8000000 old: write-back new: write-combining
-...
+ 0 - disabled (equivalent to nomtrr)
+ 3 - write-combining (default)
+
+ Values other than 0 and 3 will result in a warning and will be
+ treated just like 3.

nomtrr Do not use memory type range registers.

diff --git a/drivers/video/uvesafb.c b/drivers/video/uvesafb.c
index d428445..8701f96 100644
--- a/drivers/video/uvesafb.c
+++ b/drivers/video/uvesafb.c
@@ -24,9 +24,6 @@
#ifdef CONFIG_X86
#include <video/vga.h>
#endif
-#ifdef CONFIG_MTRR
-#include <asm/mtrr.h>
-#endif
#include "edid.h"

static struct cb_id uvesafb_cn_id = {
@@ -1540,67 +1537,30 @@ static void uvesafb_init_info(struct fb_info *info, struct vbe_mode_ib *mode)

static void uvesafb_init_mtrr(struct fb_info *info)
{
-#ifdef CONFIG_MTRR
+ struct uvesafb_par *par = info->par;
+
if (mtrr && !(info->fix.smem_start & (PAGE_SIZE - 1))) {
int temp_size = info->fix.smem_len;
- unsigned int type = 0;

- switch (mtrr) {
- case 1:
- type = MTRR_TYPE_UNCACHABLE;
- break;
- case 2:
- type = MTRR_TYPE_WRBACK;
- break;
- case 3:
- type = MTRR_TYPE_WRCOMB;
- break;
- case 4:
- type = MTRR_TYPE_WRTHROUGH;
- break;
- default:
- type = 0;
- break;
- }
+ int rc;

- if (type) {
- int rc;
+ /* Find the largest power-of-two */
+ temp_size = roundup_pow_of_two(temp_size);

- /* Find the largest power-of-two */
- temp_size = roundup_pow_of_two(temp_size);
+ /* Try and find a power of two to add */
+ do {
+ rc = arch_phys_wc_add(info->fix.smem_start, temp_size);
+ temp_size >>= 1;
+ } while (temp_size >= PAGE_SIZE && rc == -EINVAL);

- /* Try and find a power of two to add */
- do {
- rc = mtrr_add(info->fix.smem_start,
- temp_size, type, 1);
- temp_size >>= 1;
- } while (temp_size >= PAGE_SIZE && rc == -EINVAL);
- }
+ if (rc >= 0)
+ par->mtrr_handle = rc;
}
-#endif /* CONFIG_MTRR */
}

static void uvesafb_ioremap(struct fb_info *info)
{
-#ifdef CONFIG_X86
- switch (mtrr) {
- case 1: /* uncachable */
- info->screen_base = ioremap_nocache(info->fix.smem_start, info->fix.smem_len);
- break;
- case 2: /* write-back */
- info->screen_base = ioremap_cache(info->fix.smem_start, info->fix.smem_len);
- break;
- case 3: /* write-combining */
- info->screen_base = ioremap_wc(info->fix.smem_start, info->fix.smem_len);
- break;
- case 4: /* write-through */
- default:
- info->screen_base = ioremap(info->fix.smem_start, info->fix.smem_len);
- break;
- }
-#else
- info->screen_base = ioremap(info->fix.smem_start, info->fix.smem_len);
-#endif /* CONFIG_X86 */
+ info->screen_base = ioremap_wc(info->fix.smem_start, info->fix.smem_len);
}

static ssize_t uvesafb_show_vbe_ver(struct device *dev,
@@ -1851,6 +1811,7 @@ static int uvesafb_remove(struct platform_device *dev)
unregister_framebuffer(info);
release_region(0x3c0, 32);
iounmap(info->screen_base);
+ arch_phys_wc_del(par->mtrr_handle);
release_mem_region(info->fix.smem_start, info->fix.smem_len);
fb_destroy_modedb(info->monspecs.modedb);
fb_dealloc_cmap(&info->cmap);
@@ -1930,6 +1891,9 @@ static int uvesafb_setup(char *options)
}
}

+ if (mtrr != 3 && mtrr != 1)
+ pr_warn("uvesafb: mtrr should be set to 0 or 3; %d is unsupported", mtrr);
+
return 0;
}
#endif /* !MODULE */
diff --git a/include/video/uvesafb.h b/include/video/uvesafb.h
index 1a91850..30f5362 100644
--- a/include/video/uvesafb.h
+++ b/include/video/uvesafb.h
@@ -134,6 +134,7 @@ struct uvesafb_par {

int mode_idx;
struct vbe_crtc_ib crtc;
+ int mtrr_handle;
};

#endif /* _UVESAFB_H */
--
1.8.1.4

2013-05-13 23:59:29

by Andy Lutomirski

[permalink] [raw]
Subject: [PATCH v3 9/9] drm: Don't leak phys_wc "handles" to userspace

I didn't fix this in the earlier patch -- it would have broken the
build due to the now-deleted garbage in drm_os_linux.h.

Reviewed-by: Daniel Vetter <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---
drivers/gpu/drm/drm_bufs.c | 9 +++++++++
drivers/gpu/drm/drm_ioctl.c | 15 ++++++++++++++-
2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_bufs.c b/drivers/gpu/drm/drm_bufs.c
index 0190fce..5a4dbb4 100644
--- a/drivers/gpu/drm/drm_bufs.c
+++ b/drivers/gpu/drm/drm_bufs.c
@@ -414,6 +414,15 @@ int drm_addmap_ioctl(struct drm_device *dev, void *data,

/* avoid a warning on 64-bit, this casting isn't very nice, but the API is set so too late */
map->handle = (void *)(unsigned long)maplist->user_token;
+
+ /*
+ * It appears that there are no users of this value whatsoever --
+ * drmAddMap just discards it. Let's not encourage its use.
+ * (Keeping drm_addmap_core's returned mtrr value would be wrong --
+ * it's not a real mtrr index anymore.)
+ */
+ map->mtrr = -1;
+
return 0;
}

diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
index e77bd8b..ffd7a7b 100644
--- a/drivers/gpu/drm/drm_ioctl.c
+++ b/drivers/gpu/drm/drm_ioctl.c
@@ -38,6 +38,9 @@

#include <linux/pci.h>
#include <linux/export.h>
+#ifdef CONFIG_X86
+#include <asm/mtrr.h>
+#endif

/**
* Get the bus id.
@@ -181,7 +184,17 @@ int drm_getmap(struct drm_device *dev, void *data,
map->type = r_list->map->type;
map->flags = r_list->map->flags;
map->handle = (void *)(unsigned long) r_list->user_token;
- map->mtrr = r_list->map->mtrr;
+
+#ifdef CONFIG_X86
+ /*
+ * There appears to be exactly one user of the mtrr index: dritest.
+ * It's easy enough to keep it working on non-PAT systems.
+ */
+ map->mtrr = phys_wc_to_mtrr_index(r_list->map->mtrr);
+#else
+ map->mtrr = -1;
+#endif
+
mutex_unlock(&dev->struct_mutex);

return 0;
--
1.8.1.4

2013-05-13 23:59:18

by Andy Lutomirski

[permalink] [raw]
Subject: [PATCH v3 4/9] drm,agpgart: Use pgprot_writecombine for AGP maps and make the MTRR optional

I'm not sure I understand the intent of the previous behavior. mmap
on /dev/agpgart and DRM_AGP maps had no cache flags set, so they
would be fully cacheable. But the DRM code (most of the time) would
add a write-combining MTRR that would change the effective memory
type to WC.

The new behavior just requests WC explicitly for all AGP maps.

If there is any code out there that expects cacheable access to the
AGP aperture (because the drm driver doesn't request an MTRR or
because it's using /dev/agpgart directly), then it will now end up
with a UC or WC mapping, depending on the architecture and PAT
availability. But cacheable access to the aperture seems like it's
asking for trouble, because, AIUI, the aperture is an alias of RAM.

Reviewed-by: Daniel Vetter <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---

It's conceivable that libpciaccess could have issues with this due to
memtype conflicts, but I think this is unlikely (as long as a WC
mapping gets there first, I think the conflict resolution rules will
all work out). In any case, everything that maps the AGP aperture
ought to be using /dev/agpgart or the DRM API, in which case
everything should be okay.

I don't have anything with an AGP slot to test AFAIK.

drivers/char/agp/frontend.c | 8 +++++---
drivers/gpu/drm/drm_pci.c | 8 ++++----
drivers/gpu/drm/drm_stub.c | 10 ++--------
drivers/gpu/drm/drm_vm.c | 11 ++++-------
4 files changed, 15 insertions(+), 22 deletions(-)

diff --git a/drivers/char/agp/frontend.c b/drivers/char/agp/frontend.c
index 2e04433..1b19239 100644
--- a/drivers/char/agp/frontend.c
+++ b/drivers/char/agp/frontend.c
@@ -603,7 +603,8 @@ static int agp_mmap(struct file *file, struct vm_area_struct *vma)
vma->vm_ops = kerninfo.vm_ops;
} else if (io_remap_pfn_range(vma, vma->vm_start,
(kerninfo.aper_base + offset) >> PAGE_SHIFT,
- size, vma->vm_page_prot)) {
+ size,
+ pgprot_writecombine(vma->vm_page_prot))) {
goto out_again;
}
mutex_unlock(&(agp_fe.agp_mutex));
@@ -618,8 +619,9 @@ static int agp_mmap(struct file *file, struct vm_area_struct *vma)
if (kerninfo.vm_ops) {
vma->vm_ops = kerninfo.vm_ops;
} else if (io_remap_pfn_range(vma, vma->vm_start,
- kerninfo.aper_base >> PAGE_SHIFT,
- size, vma->vm_page_prot)) {
+ kerninfo.aper_base >> PAGE_SHIFT,
+ size,
+ pgprot_writecombine(vma->vm_page_prot))) {
goto out_again;
}
mutex_unlock(&(agp_fe.agp_mutex));
diff --git a/drivers/gpu/drm/drm_pci.c b/drivers/gpu/drm/drm_pci.c
index bd719e9..d0f6699 100644
--- a/drivers/gpu/drm/drm_pci.c
+++ b/drivers/gpu/drm/drm_pci.c
@@ -278,10 +278,10 @@ int drm_pci_agp_init(struct drm_device *dev)
}
if (drm_core_has_MTRR(dev)) {
if (dev->agp)
- dev->agp->agp_mtrr =
- mtrr_add(dev->agp->agp_info.aper_base,
- dev->agp->agp_info.aper_size *
- 1024 * 1024, MTRR_TYPE_WRCOMB, 1);
+ dev->agp->agp_mtrr = arch_phys_wc_add(
+ dev->agp->agp_info.aper_base,
+ dev->agp->agp_info.aper_size *
+ 1024 * 1024);
}
}
return 0;
diff --git a/drivers/gpu/drm/drm_stub.c b/drivers/gpu/drm/drm_stub.c
index 7d30802..9e2acdf 100644
--- a/drivers/gpu/drm/drm_stub.c
+++ b/drivers/gpu/drm/drm_stub.c
@@ -451,14 +451,8 @@ void drm_put_dev(struct drm_device *dev)

drm_lastclose(dev);

- if (drm_core_has_MTRR(dev) && drm_core_has_AGP(dev) &&
- dev->agp && dev->agp->agp_mtrr >= 0) {
- int retval;
- retval = mtrr_del(dev->agp->agp_mtrr,
- dev->agp->agp_info.aper_base,
- dev->agp->agp_info.aper_size * 1024 * 1024);
- DRM_DEBUG("mtrr_del=%d\n", retval);
- }
+ if (drm_core_has_MTRR(dev) && drm_core_has_AGP(dev) && dev->agp)
+ arch_phys_wc_del(dev->agp->agp_mtrr);

if (dev->driver->unload)
dev->driver->unload(dev);
diff --git a/drivers/gpu/drm/drm_vm.c b/drivers/gpu/drm/drm_vm.c
index 163f436..b1e4ec8 100644
--- a/drivers/gpu/drm/drm_vm.c
+++ b/drivers/gpu/drm/drm_vm.c
@@ -49,13 +49,10 @@ static pgprot_t drm_io_prot(struct drm_local_map *map,
pgprot_t tmp = vm_get_page_prot(vma->vm_flags);

#if defined(__i386__) || defined(__x86_64__)
- if (map->type != _DRM_AGP) {
- if (map->type == _DRM_FRAME_BUFFER ||
- map->flags & _DRM_WRITE_COMBINING)
- tmp = pgprot_writecombine(tmp);
- else
- tmp = pgprot_noncached(tmp);
- }
+ if (map->type == _DRM_REGISTERS && !(map->flags & _DRM_WRITE_COMBINING))
+ tmp = pgprot_noncached(tmp);
+ else
+ tmp = pgprot_writecombine(tmp);
#elif defined(__powerpc__)
pgprot_val(tmp) |= _PAGE_NO_CACHE;
if (map_type == _DRM_REGISTERS)
--
1.8.1.4

2013-05-13 23:59:54

by Andy Lutomirski

[permalink] [raw]
Subject: [PATCH v3 8/9] drm: Remove mtrr_add and mtrr_del fallback hack for non-MTRR systems

There are no users left in drivers/gpu.

Reviewed-by: Daniel Vetter <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---
include/drm/drmP.h | 5 +----
include/drm/drm_os_linux.h | 16 ----------------
2 files changed, 1 insertion(+), 20 deletions(-)

diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index 3e6cfa0..7a9fef5 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -55,16 +55,13 @@
#include <linux/mm.h>
#include <linux/cdev.h>
#include <linux/mutex.h>
+#include <linux/io.h>
#include <linux/slab.h>
#if defined(__alpha__) || defined(__powerpc__)
#include <asm/pgtable.h> /* For pte_wrprotect */
#endif
-#include <asm/io.h>
#include <asm/mman.h>
#include <asm/uaccess.h>
-#ifdef CONFIG_MTRR
-#include <asm/mtrr.h>
-#endif
#if defined(CONFIG_AGP) || defined(CONFIG_AGP_MODULE)
#include <linux/types.h>
#include <linux/agp_backend.h>
diff --git a/include/drm/drm_os_linux.h b/include/drm/drm_os_linux.h
index 3933691..35c7c2b 100644
--- a/include/drm/drm_os_linux.h
+++ b/include/drm/drm_os_linux.h
@@ -65,22 +65,6 @@ struct no_agp_kern {
#define DRM_AGP_KERN struct no_agp_kern
#endif

-#if !(__OS_HAS_MTRR)
-static __inline__ int mtrr_add(unsigned long base, unsigned long size,
- unsigned int type, char increment)
-{
- return -ENODEV;
-}
-
-static __inline__ int mtrr_del(int reg, unsigned long base, unsigned long size)
-{
- return -ENODEV;
-}
-
-#define MTRR_TYPE_WRCOMB 1
-
-#endif
-
/** Other copying of data to kernel space */
#define DRM_COPY_FROM_USER(arg1, arg2, arg3) \
copy_from_user(arg1, arg2, arg3)
--
1.8.1.4

2013-05-14 00:00:27

by Andy Lutomirski

[permalink] [raw]
Subject: [PATCH v3 6/9] radeon: Switch to arch_phys_wc_add and add a missing ..._del

Reviewed-by: Daniel Vetter <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---
drivers/gpu/drm/radeon/radeon_object.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index d3aface..15cd34b 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -321,8 +321,8 @@ void radeon_bo_force_delete(struct radeon_device *rdev)
int radeon_bo_init(struct radeon_device *rdev)
{
/* Add an MTRR for the VRAM */
- rdev->mc.vram_mtrr = mtrr_add(rdev->mc.aper_base, rdev->mc.aper_size,
- MTRR_TYPE_WRCOMB, 1);
+ rdev->mc.vram_mtrr = arch_phys_wc_add(rdev->mc.aper_base,
+ rdev->mc.aper_size);
DRM_INFO("Detected VRAM RAM=%lluM, BAR=%lluM\n",
rdev->mc.mc_vram_size >> 20,
(unsigned long long)rdev->mc.aper_size >> 20);
@@ -334,6 +334,7 @@ int radeon_bo_init(struct radeon_device *rdev)
void radeon_bo_fini(struct radeon_device *rdev)
{
radeon_ttm_fini(rdev);
+ arch_phys_wc_del(rdev->mc.vram_mtrr);
}

void radeon_bo_list_add_object(struct radeon_bo_list *lobj,
--
1.8.1.4

2013-05-13 23:59:15

by Andy Lutomirski

[permalink] [raw]
Subject: [PATCH v3 3/9] drm: Update drm_addmap and drm_mmap to use PAT WC instead of MTRRs

Previously, DRM_FRAME_BUFFER mappings, as well as DRM_REGISTERS
mappings with DRM_WRITE_COMBINING set, resulted in an unconditional
MTRR being added but the actual mappings being created as UC-.

Now these mappings have the MTRR added only if needed, but they will
be mapped with pgprot_writecombine.

The non-WC DRM_REGISTERS case now uses pgprot_noncached instead of
hardcoding the bit twiddling.

The DRM_AGP case is unchanged for now.

Reviewed-by: Daniel Vetter <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---
drivers/gpu/drm/drm_bufs.c | 17 +++++++++--------
drivers/gpu/drm/drm_vm.c | 23 +++++++++++------------
2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/drm_bufs.c b/drivers/gpu/drm/drm_bufs.c
index 0128147..0190fce 100644
--- a/drivers/gpu/drm/drm_bufs.c
+++ b/drivers/gpu/drm/drm_bufs.c
@@ -210,12 +210,16 @@ static int drm_addmap_core(struct drm_device * dev, resource_size_t offset,
if (drm_core_has_MTRR(dev)) {
if (map->type == _DRM_FRAME_BUFFER ||
(map->flags & _DRM_WRITE_COMBINING)) {
- map->mtrr = mtrr_add(map->offset, map->size,
- MTRR_TYPE_WRCOMB, 1);
+ map->mtrr =
+ arch_phys_wc_add(map->offset, map->size);
}
}
if (map->type == _DRM_REGISTERS) {
- map->handle = ioremap(map->offset, map->size);
+ if (map->flags & _DRM_WRITE_COMBINING)
+ map->handle = ioremap_wc(map->offset,
+ map->size);
+ else
+ map->handle = ioremap(map->offset, map->size);
if (!map->handle) {
kfree(map);
return -ENOMEM;
@@ -451,11 +455,8 @@ int drm_rmmap_locked(struct drm_device *dev, struct drm_local_map *map)
iounmap(map->handle);
/* FALLTHROUGH */
case _DRM_FRAME_BUFFER:
- if (drm_core_has_MTRR(dev) && map->mtrr >= 0) {
- int retcode;
- retcode = mtrr_del(map->mtrr, map->offset, map->size);
- DRM_DEBUG("mtrr_del=%d\n", retcode);
- }
+ if (drm_core_has_MTRR(dev))
+ arch_phys_wc_del(map->mtrr);
break;
case _DRM_SHM:
vfree(map->handle);
diff --git a/drivers/gpu/drm/drm_vm.c b/drivers/gpu/drm/drm_vm.c
index db7bd29..163f436 100644
--- a/drivers/gpu/drm/drm_vm.c
+++ b/drivers/gpu/drm/drm_vm.c
@@ -43,14 +43,18 @@
static void drm_vm_open(struct vm_area_struct *vma);
static void drm_vm_close(struct vm_area_struct *vma);

-static pgprot_t drm_io_prot(uint32_t map_type, struct vm_area_struct *vma)
+static pgprot_t drm_io_prot(struct drm_local_map *map,
+ struct vm_area_struct *vma)
{
pgprot_t tmp = vm_get_page_prot(vma->vm_flags);

#if defined(__i386__) || defined(__x86_64__)
- if (boot_cpu_data.x86 > 3 && map_type != _DRM_AGP) {
- pgprot_val(tmp) |= _PAGE_PCD;
- pgprot_val(tmp) &= ~_PAGE_PWT;
+ if (map->type != _DRM_AGP) {
+ if (map->type == _DRM_FRAME_BUFFER ||
+ map->flags & _DRM_WRITE_COMBINING)
+ tmp = pgprot_writecombine(tmp);
+ else
+ tmp = pgprot_noncached(tmp);
}
#elif defined(__powerpc__)
pgprot_val(tmp) |= _PAGE_NO_CACHE;
@@ -250,13 +254,8 @@ static void drm_vm_shm_close(struct vm_area_struct *vma)
switch (map->type) {
case _DRM_REGISTERS:
case _DRM_FRAME_BUFFER:
- if (drm_core_has_MTRR(dev) && map->mtrr >= 0) {
- int retcode;
- retcode = mtrr_del(map->mtrr,
- map->offset,
- map->size);
- DRM_DEBUG("mtrr_del = %d\n", retcode);
- }
+ if (drm_core_has_MTRR(dev))
+ arch_phys_wc_del(map->mtrr);
iounmap(map->handle);
break;
case _DRM_SHM:
@@ -617,7 +616,7 @@ int drm_mmap_locked(struct file *filp, struct vm_area_struct *vma)
case _DRM_REGISTERS:
offset = drm_core_get_reg_ofs(dev);
vma->vm_flags |= VM_IO; /* not in core dump */
- vma->vm_page_prot = drm_io_prot(map->type, vma);
+ vma->vm_page_prot = drm_io_prot(map, vma);
if (io_remap_pfn_range(vma, vma->vm_start,
(map->offset + offset) >> PAGE_SHIFT,
vma->vm_end - vma->vm_start,
--
1.8.1.4

2013-05-14 00:00:54

by Andy Lutomirski

[permalink] [raw]
Subject: [PATCH v3 5/9] i915: Use arch_phys_wc_{add,del}

i915 open-coded logic that was essentially equivalent to the new API.

Reviewed-by: Daniel Vetter <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---

Changes from v1: Don't zero the mtrr handle after freeing it

drivers/gpu/drm/i915/i915_dma.c | 42 ++++-------------------------------------
1 file changed, 4 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 4fa6beb..b0bb381 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -42,7 +42,6 @@
#include <linux/vga_switcheroo.h>
#include <linux/slab.h>
#include <acpi/video.h>
-#include <asm/pat.h>

#define LP_RING(d) (&((struct drm_i915_private *)(d))->ring[RCS])

@@ -1393,29 +1392,6 @@ void i915_master_destroy(struct drm_device *dev, struct drm_master *master)
master->driver_priv = NULL;
}

-static void
-i915_mtrr_setup(struct drm_i915_private *dev_priv, unsigned long base,
- unsigned long size)
-{
- dev_priv->mm.gtt_mtrr = -1;
-
-#if defined(CONFIG_X86_PAT)
- if (cpu_has_pat)
- return;
-#endif
-
- /* Set up a WC MTRR for non-PAT systems. This is more common than
- * one would think, because the kernel disables PAT on first
- * generation Core chips because WC PAT gets overridden by a UC
- * MTRR if present. Even if a UC MTRR isn't present.
- */
- dev_priv->mm.gtt_mtrr = mtrr_add(base, size, MTRR_TYPE_WRCOMB, 1);
- if (dev_priv->mm.gtt_mtrr < 0) {
- DRM_INFO("MTRR allocation failed. Graphics "
- "performance may suffer.\n");
- }
-}
-
static void i915_kick_out_firmware_fb(struct drm_i915_private *dev_priv)
{
struct apertures_struct *ap;
@@ -1552,8 +1528,8 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
goto out_rmmap;
}

- i915_mtrr_setup(dev_priv, dev_priv->gtt.mappable_base,
- aperture_size);
+ dev_priv->mm.gtt_mtrr = arch_phys_wc_add(dev_priv->gtt.mappable_base,
+ aperture_size);

/* The i915 workqueue is primarily used for batched retirement of
* requests (and thus managing bo) once the task has been completed
@@ -1656,12 +1632,7 @@ out_gem_unload:
intel_teardown_mchbar(dev);
destroy_workqueue(dev_priv->wq);
out_mtrrfree:
- if (dev_priv->mm.gtt_mtrr >= 0) {
- mtrr_del(dev_priv->mm.gtt_mtrr,
- dev_priv->gtt.mappable_base,
- aperture_size);
- dev_priv->mm.gtt_mtrr = -1;
- }
+ arch_phys_wc_del(dev_priv->mm.gtt_mtrr);
io_mapping_free(dev_priv->gtt.mappable);
out_rmmap:
pci_iounmap(dev->pdev, dev_priv->regs);
@@ -1697,12 +1668,7 @@ int i915_driver_unload(struct drm_device *dev)
cancel_delayed_work_sync(&dev_priv->mm.retire_work);

io_mapping_free(dev_priv->gtt.mappable);
- if (dev_priv->mm.gtt_mtrr >= 0) {
- mtrr_del(dev_priv->mm.gtt_mtrr,
- dev_priv->gtt.mappable_base,
- dev_priv->gtt.mappable_end);
- dev_priv->mm.gtt_mtrr = -1;
- }
+ arch_phys_wc_del(dev_priv->mm.gtt_mtrr);

acpi_video_unregister();

--
1.8.1.4

2013-05-14 00:01:20

by Andy Lutomirski

[permalink] [raw]
Subject: [PATCH v3 2/9] drm (ast,cirrus,mgag200,nouveau,savage,vmwgfx): Remove drm_mtrr_{add,del}

This replaces drm_mtrr_{add,del} with arch_phys_wc_{add,del}. The
interface is simplified (because the base and size parameters to
drm_mtrr_del never did anything), and it no longer adds MTRRs on
systems that don't need them.

Reviewed-by: Daniel Vetter <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---
drivers/gpu/drm/ast/ast_ttm.c | 13 +++--------
drivers/gpu/drm/cirrus/cirrus_ttm.c | 15 ++++--------
drivers/gpu/drm/mgag200/mgag200_ttm.c | 14 ++++--------
drivers/gpu/drm/nouveau/nouveau_ttm.c | 13 ++++-------
drivers/gpu/drm/savage/savage_bci.c | 43 ++++++++++++-----------------------
drivers/gpu/drm/savage/savage_drv.h | 5 +---
drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 10 ++++----
include/drm/drmP.h | 29 -----------------------
8 files changed, 35 insertions(+), 107 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_ttm.c b/drivers/gpu/drm/ast/ast_ttm.c
index 3602731..c4574fd 100644
--- a/drivers/gpu/drm/ast/ast_ttm.c
+++ b/drivers/gpu/drm/ast/ast_ttm.c
@@ -271,26 +271,19 @@ int ast_mm_init(struct ast_private *ast)
return ret;
}

- ast->fb_mtrr = drm_mtrr_add(pci_resource_start(dev->pdev, 0),
- pci_resource_len(dev->pdev, 0),
- DRM_MTRR_WC);
+ ast->fb_mtrr = arch_phys_wc_add(pci_resource_start(dev->pdev, 0),
+ pci_resource_len(dev->pdev, 0));

return 0;
}

void ast_mm_fini(struct ast_private *ast)
{
- struct drm_device *dev = ast->dev;
ttm_bo_device_release(&ast->ttm.bdev);

ast_ttm_global_release(ast);

- if (ast->fb_mtrr >= 0) {
- drm_mtrr_del(ast->fb_mtrr,
- pci_resource_start(dev->pdev, 0),
- pci_resource_len(dev->pdev, 0), DRM_MTRR_WC);
- ast->fb_mtrr = -1;
- }
+ arch_phys_wc_del(ast->fb_mtrr);
}

void ast_ttm_placement(struct ast_bo *bo, int domain)
diff --git a/drivers/gpu/drm/cirrus/cirrus_ttm.c b/drivers/gpu/drm/cirrus/cirrus_ttm.c
index 1413a26..09f06d3 100644
--- a/drivers/gpu/drm/cirrus/cirrus_ttm.c
+++ b/drivers/gpu/drm/cirrus/cirrus_ttm.c
@@ -271,9 +271,8 @@ int cirrus_mm_init(struct cirrus_device *cirrus)
return ret;
}

- cirrus->fb_mtrr = drm_mtrr_add(pci_resource_start(dev->pdev, 0),
- pci_resource_len(dev->pdev, 0),
- DRM_MTRR_WC);
+ cirrus->fb_mtrr = arch_phys_wc_add(pci_resource_start(dev->pdev, 0),
+ pci_resource_len(dev->pdev, 0));

cirrus->mm_inited = true;
return 0;
@@ -281,8 +280,6 @@ int cirrus_mm_init(struct cirrus_device *cirrus)

void cirrus_mm_fini(struct cirrus_device *cirrus)
{
- struct drm_device *dev = cirrus->dev;
-
if (!cirrus->mm_inited)
return;

@@ -290,12 +287,8 @@ void cirrus_mm_fini(struct cirrus_device *cirrus)

cirrus_ttm_global_release(cirrus);

- if (cirrus->fb_mtrr >= 0) {
- drm_mtrr_del(cirrus->fb_mtrr,
- pci_resource_start(dev->pdev, 0),
- pci_resource_len(dev->pdev, 0), DRM_MTRR_WC);
- cirrus->fb_mtrr = -1;
- }
+ arch_phys_wc_del(cirrus->fb_mtrr);
+ cirrus->fb_mtrr = 0;
}

void cirrus_ttm_placement(struct cirrus_bo *bo, int domain)
diff --git a/drivers/gpu/drm/mgag200/mgag200_ttm.c b/drivers/gpu/drm/mgag200/mgag200_ttm.c
index 8fc9d92..5c6f3c8 100644
--- a/drivers/gpu/drm/mgag200/mgag200_ttm.c
+++ b/drivers/gpu/drm/mgag200/mgag200_ttm.c
@@ -270,26 +270,20 @@ int mgag200_mm_init(struct mga_device *mdev)
return ret;
}

- mdev->fb_mtrr = drm_mtrr_add(pci_resource_start(dev->pdev, 0),
- pci_resource_len(dev->pdev, 0),
- DRM_MTRR_WC);
+ mdev->fb_mtrr = arch_phys_wc_add(pci_resource_start(dev->pdev, 0),
+ pci_resource_len(dev->pdev, 0));

return 0;
}

void mgag200_mm_fini(struct mga_device *mdev)
{
- struct drm_device *dev = mdev->dev;
ttm_bo_device_release(&mdev->ttm.bdev);

mgag200_ttm_global_release(mdev);

- if (mdev->fb_mtrr >= 0) {
- drm_mtrr_del(mdev->fb_mtrr,
- pci_resource_start(dev->pdev, 0),
- pci_resource_len(dev->pdev, 0), DRM_MTRR_WC);
- mdev->fb_mtrr = -1;
- }
+ arch_phys_wc_del(mdev->fb_mtrr);
+ mdev->fb_mtrr = 0;
}

void mgag200_ttm_placement(struct mgag200_bo *bo, int domain)
diff --git a/drivers/gpu/drm/nouveau/nouveau_ttm.c b/drivers/gpu/drm/nouveau/nouveau_ttm.c
index 9be9cb5..63fa6a5 100644
--- a/drivers/gpu/drm/nouveau/nouveau_ttm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_ttm.c
@@ -377,9 +377,8 @@ nouveau_ttm_init(struct nouveau_drm *drm)
return ret;
}

- drm->ttm.mtrr = drm_mtrr_add(pci_resource_start(dev->pdev, 1),
- pci_resource_len(dev->pdev, 1),
- DRM_MTRR_WC);
+ drm->ttm.mtrr = arch_phys_wc_add(pci_resource_start(dev->pdev, 1),
+ pci_resource_len(dev->pdev, 1));

/* GART init */
if (drm->agp.stat != ENABLED) {
@@ -414,10 +413,6 @@ nouveau_ttm_fini(struct nouveau_drm *drm)

nouveau_ttm_global_release(drm);

- if (drm->ttm.mtrr >= 0) {
- drm_mtrr_del(drm->ttm.mtrr,
- pci_resource_start(drm->dev->pdev, 1),
- pci_resource_len(drm->dev->pdev, 1), DRM_MTRR_WC);
- drm->ttm.mtrr = -1;
- }
+ arch_phys_wc_del(drm->ttm.mtrr);
+ drm->ttm.mtrr = 0;
}
diff --git a/drivers/gpu/drm/savage/savage_bci.c b/drivers/gpu/drm/savage/savage_bci.c
index b55c1d6..bd6b2cf 100644
--- a/drivers/gpu/drm/savage/savage_bci.c
+++ b/drivers/gpu/drm/savage/savage_bci.c
@@ -570,9 +570,6 @@ int savage_driver_firstopen(struct drm_device *dev)
unsigned int fb_rsrc, aper_rsrc;
int ret = 0;

- dev_priv->mtrr[0].handle = -1;
- dev_priv->mtrr[1].handle = -1;
- dev_priv->mtrr[2].handle = -1;
if (S3_SAVAGE3D_SERIES(dev_priv->chipset)) {
fb_rsrc = 0;
fb_base = pci_resource_start(dev->pdev, 0);
@@ -584,21 +581,14 @@ int savage_driver_firstopen(struct drm_device *dev)
if (pci_resource_len(dev->pdev, 0) == 0x08000000) {
/* Don't make MMIO write-cobining! We need 3
* MTRRs. */
- dev_priv->mtrr[0].base = fb_base;
- dev_priv->mtrr[0].size = 0x01000000;
- dev_priv->mtrr[0].handle =
- drm_mtrr_add(dev_priv->mtrr[0].base,
- dev_priv->mtrr[0].size, DRM_MTRR_WC);
- dev_priv->mtrr[1].base = fb_base + 0x02000000;
- dev_priv->mtrr[1].size = 0x02000000;
- dev_priv->mtrr[1].handle =
- drm_mtrr_add(dev_priv->mtrr[1].base,
- dev_priv->mtrr[1].size, DRM_MTRR_WC);
- dev_priv->mtrr[2].base = fb_base + 0x04000000;
- dev_priv->mtrr[2].size = 0x04000000;
- dev_priv->mtrr[2].handle =
- drm_mtrr_add(dev_priv->mtrr[2].base,
- dev_priv->mtrr[2].size, DRM_MTRR_WC);
+ dev_priv->mtrr_handles[0] =
+ arch_phys_wc_add(fb_base, 0x01000000);
+ dev_priv->mtrr_handles[1] =
+ arch_phys_wc_add(fb_base + 0x02000000,
+ 0x02000000);
+ dev_priv->mtrr_handles[2] =
+ arch_phys_wc_add(fb_base + 0x04000000,
+ 0x04000000);
} else {
DRM_ERROR("strange pci_resource_len %08llx\n",
(unsigned long long)
@@ -616,11 +606,9 @@ int savage_driver_firstopen(struct drm_device *dev)
if (pci_resource_len(dev->pdev, 1) == 0x08000000) {
/* Can use one MTRR to cover both fb and
* aperture. */
- dev_priv->mtrr[0].base = fb_base;
- dev_priv->mtrr[0].size = 0x08000000;
- dev_priv->mtrr[0].handle =
- drm_mtrr_add(dev_priv->mtrr[0].base,
- dev_priv->mtrr[0].size, DRM_MTRR_WC);
+ dev_priv->mtrr_handles[0] =
+ arch_phys_wc_add(fb_base,
+ 0x08000000);
} else {
DRM_ERROR("strange pci_resource_len %08llx\n",
(unsigned long long)
@@ -660,11 +648,10 @@ void savage_driver_lastclose(struct drm_device *dev)
drm_savage_private_t *dev_priv = dev->dev_private;
int i;

- for (i = 0; i < 3; ++i)
- if (dev_priv->mtrr[i].handle >= 0)
- drm_mtrr_del(dev_priv->mtrr[i].handle,
- dev_priv->mtrr[i].base,
- dev_priv->mtrr[i].size, DRM_MTRR_WC);
+ for (i = 0; i < 3; ++i) {
+ arch_phys_wc_del(dev_priv->mtrr_handles[i]);
+ dev_priv->mtrr_handles[i] = 0;
+ }
}

int savage_driver_unload(struct drm_device *dev)
diff --git a/drivers/gpu/drm/savage/savage_drv.h b/drivers/gpu/drm/savage/savage_drv.h
index df2aac6..c05082a 100644
--- a/drivers/gpu/drm/savage/savage_drv.h
+++ b/drivers/gpu/drm/savage/savage_drv.h
@@ -160,10 +160,7 @@ typedef struct drm_savage_private {
drm_local_map_t *cmd_dma;
drm_local_map_t fake_dma;

- struct {
- int handle;
- unsigned long base, size;
- } mtrr[3];
+ int mtrr_handles[3];

/* BCI and status-related stuff */
volatile uint32_t *status_ptr, *bci_ptr;
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
index 07dfd82..78e2164 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
@@ -565,8 +565,8 @@ static int vmw_driver_load(struct drm_device *dev, unsigned long chipset)
dev_priv->has_gmr = false;
}

- dev_priv->mmio_mtrr = drm_mtrr_add(dev_priv->mmio_start,
- dev_priv->mmio_size, DRM_MTRR_WC);
+ dev_priv->mmio_mtrr = arch_phys_wc_add(dev_priv->mmio_start,
+ dev_priv->mmio_size);

dev_priv->mmio_virt = ioremap_wc(dev_priv->mmio_start,
dev_priv->mmio_size);
@@ -664,8 +664,7 @@ out_no_device:
out_err4:
iounmap(dev_priv->mmio_virt);
out_err3:
- drm_mtrr_del(dev_priv->mmio_mtrr, dev_priv->mmio_start,
- dev_priv->mmio_size, DRM_MTRR_WC);
+ arch_phys_wc_del(dev_priv->mmio_mtrr);
if (dev_priv->has_gmr)
(void) ttm_bo_clean_mm(&dev_priv->bdev, VMW_PL_GMR);
(void)ttm_bo_clean_mm(&dev_priv->bdev, TTM_PL_VRAM);
@@ -709,8 +708,7 @@ static int vmw_driver_unload(struct drm_device *dev)

ttm_object_device_release(&dev_priv->tdev);
iounmap(dev_priv->mmio_virt);
- drm_mtrr_del(dev_priv->mmio_mtrr, dev_priv->mmio_start,
- dev_priv->mmio_size, DRM_MTRR_WC);
+ arch_phys_wc_del(dev_priv->mmio_mtrr);
if (dev_priv->has_gmr)
(void)ttm_bo_clean_mm(&dev_priv->bdev, VMW_PL_GMR);
(void)ttm_bo_clean_mm(&dev_priv->bdev, TTM_PL_VRAM);
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index 2d94d74..3e6cfa0 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -1249,37 +1249,8 @@ static inline int drm_core_has_MTRR(struct drm_device *dev)
{
return drm_core_check_feature(dev, DRIVER_USE_MTRR);
}
-
-#define DRM_MTRR_WC MTRR_TYPE_WRCOMB
-
-static inline int drm_mtrr_add(unsigned long offset, unsigned long size,
- unsigned int flags)
-{
- return mtrr_add(offset, size, flags, 1);
-}
-
-static inline int drm_mtrr_del(int handle, unsigned long offset,
- unsigned long size, unsigned int flags)
-{
- return mtrr_del(handle, offset, size);
-}
-
#else
#define drm_core_has_MTRR(dev) (0)
-
-#define DRM_MTRR_WC 0
-
-static inline int drm_mtrr_add(unsigned long offset, unsigned long size,
- unsigned int flags)
-{
- return 0;
-}
-
-static inline int drm_mtrr_del(int handle, unsigned long offset,
- unsigned long size, unsigned int flags)
-{
- return 0;
-}
#endif

static inline void drm_device_set_unplugged(struct drm_device *dev)
--
1.8.1.4

2013-05-14 12:58:18

by Alex Deucher

[permalink] [raw]
Subject: Re: [PATCH v3 6/9] radeon: Switch to arch_phys_wc_add and add a missing ..._del

On Mon, May 13, 2013 at 7:58 PM, Andy Lutomirski <[email protected]> wrote:
> Reviewed-by: Daniel Vetter <[email protected]>
> Signed-off-by: Andy Lutomirski <[email protected]>

Reviewed-by: Alex Deucher <[email protected]>

> ---
> drivers/gpu/drm/radeon/radeon_object.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
> index d3aface..15cd34b 100644
> --- a/drivers/gpu/drm/radeon/radeon_object.c
> +++ b/drivers/gpu/drm/radeon/radeon_object.c
> @@ -321,8 +321,8 @@ void radeon_bo_force_delete(struct radeon_device *rdev)
> int radeon_bo_init(struct radeon_device *rdev)
> {
> /* Add an MTRR for the VRAM */
> - rdev->mc.vram_mtrr = mtrr_add(rdev->mc.aper_base, rdev->mc.aper_size,
> - MTRR_TYPE_WRCOMB, 1);
> + rdev->mc.vram_mtrr = arch_phys_wc_add(rdev->mc.aper_base,
> + rdev->mc.aper_size);
> DRM_INFO("Detected VRAM RAM=%lluM, BAR=%lluM\n",
> rdev->mc.mc_vram_size >> 20,
> (unsigned long long)rdev->mc.aper_size >> 20);
> @@ -334,6 +334,7 @@ int radeon_bo_init(struct radeon_device *rdev)
> void radeon_bo_fini(struct radeon_device *rdev)
> {
> radeon_ttm_fini(rdev);
> + arch_phys_wc_del(rdev->mc.vram_mtrr);
> }
>
> void radeon_bo_list_add_object(struct radeon_bo_list *lobj,
> --
> 1.8.1.4
>

2013-05-14 13:37:42

by Jerome Glisse

[permalink] [raw]
Subject: Re: [PATCH v3 6/9] radeon: Switch to arch_phys_wc_add and add a missing ..._del

On Tue, May 14, 2013 at 8:58 AM, Alex Deucher <[email protected]> wrote:
> On Mon, May 13, 2013 at 7:58 PM, Andy Lutomirski <[email protected]> wrote:
>> Reviewed-by: Daniel Vetter <[email protected]>
>> Signed-off-by: Andy Lutomirski <[email protected]>
>
> Reviewed-by: Alex Deucher <[email protected]>

I believe it will break something but we could deal with the fallout
once it happens.

>> ---
>> drivers/gpu/drm/radeon/radeon_object.c | 5 +++--
>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
>> index d3aface..15cd34b 100644
>> --- a/drivers/gpu/drm/radeon/radeon_object.c
>> +++ b/drivers/gpu/drm/radeon/radeon_object.c
>> @@ -321,8 +321,8 @@ void radeon_bo_force_delete(struct radeon_device *rdev)
>> int radeon_bo_init(struct radeon_device *rdev)
>> {
>> /* Add an MTRR for the VRAM */
>> - rdev->mc.vram_mtrr = mtrr_add(rdev->mc.aper_base, rdev->mc.aper_size,
>> - MTRR_TYPE_WRCOMB, 1);
>> + rdev->mc.vram_mtrr = arch_phys_wc_add(rdev->mc.aper_base,
>> + rdev->mc.aper_size);
>> DRM_INFO("Detected VRAM RAM=%lluM, BAR=%lluM\n",
>> rdev->mc.mc_vram_size >> 20,
>> (unsigned long long)rdev->mc.aper_size >> 20);
>> @@ -334,6 +334,7 @@ int radeon_bo_init(struct radeon_device *rdev)
>> void radeon_bo_fini(struct radeon_device *rdev)
>> {
>> radeon_ttm_fini(rdev);
>> + arch_phys_wc_del(rdev->mc.vram_mtrr);
>> }
>>
>> void radeon_bo_list_add_object(struct radeon_bo_list *lobj,
>> --
>> 1.8.1.4
>>

2013-05-14 21:35:29

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v3 6/9] radeon: Switch to arch_phys_wc_add and add a missing ..._del

On Tue, May 14, 2013 at 6:37 AM, Jerome Glisse <[email protected]> wrote:
> On Tue, May 14, 2013 at 8:58 AM, Alex Deucher <[email protected]> wrote:
>> On Mon, May 13, 2013 at 7:58 PM, Andy Lutomirski <[email protected]> wrote:
>>> Reviewed-by: Daniel Vetter <[email protected]>
>>> Signed-off-by: Andy Lutomirski <[email protected]>
>>
>> Reviewed-by: Alex Deucher <[email protected]>
>
> I believe it will break something but we could deal with the fallout
> once it happens.

FWIW, I'm running with this code on my machine right now using the
radeon driver. Everything seems fine. If I build without MTRRs and
without PAT, though, graphics are slow (as expected). So I think
everything's okay.

--Andy

2013-05-15 14:49:41

by Jerome Glisse

[permalink] [raw]
Subject: Re: [PATCH v3 6/9] radeon: Switch to arch_phys_wc_add and add a missing ..._del

On Tue, May 14, 2013 at 5:35 PM, Andy Lutomirski <[email protected]> wrote:
> On Tue, May 14, 2013 at 6:37 AM, Jerome Glisse <[email protected]> wrote:
>> On Tue, May 14, 2013 at 8:58 AM, Alex Deucher <[email protected]> wrote:
>>> On Mon, May 13, 2013 at 7:58 PM, Andy Lutomirski <[email protected]> wrote:
>>>> Reviewed-by: Daniel Vetter <[email protected]>
>>>> Signed-off-by: Andy Lutomirski <[email protected]>
>>>
>>> Reviewed-by: Alex Deucher <[email protected]>
>>
>> I believe it will break something but we could deal with the fallout
>> once it happens.
>
> FWIW, I'm running with this code on my machine right now using the
> radeon driver. Everything seems fine. If I build without MTRRs and
> without PAT, though, graphics are slow (as expected). So I think
> everything's okay.
>
> --Andy

I am worried on p4 where i last see issue with that notably with agp.

Cheers,
Jerome

2013-05-15 18:22:44

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v3 6/9] radeon: Switch to arch_phys_wc_add and add a missing ..._del

On Wed, May 15, 2013 at 7:49 AM, Jerome Glisse <[email protected]> wrote:
> On Tue, May 14, 2013 at 5:35 PM, Andy Lutomirski <[email protected]> wrote:
>> On Tue, May 14, 2013 at 6:37 AM, Jerome Glisse <[email protected]> wrote:
>>> On Tue, May 14, 2013 at 8:58 AM, Alex Deucher <[email protected]> wrote:
>>>> On Mon, May 13, 2013 at 7:58 PM, Andy Lutomirski <[email protected]> wrote:
>>>>> Reviewed-by: Daniel Vetter <[email protected]>
>>>>> Signed-off-by: Andy Lutomirski <[email protected]>
>>>>
>>>> Reviewed-by: Alex Deucher <[email protected]>
>>>
>>> I believe it will break something but we could deal with the fallout
>>> once it happens.
>>
>> FWIW, I'm running with this code on my machine right now using the
>> radeon driver. Everything seems fine. If I build without MTRRs and
>> without PAT, though, graphics are slow (as expected). So I think
>> everything's okay.
>>
>> --Andy
>
> I am worried on p4 where i last see issue with that notably with agp.

Do you remember any details? It looks like PAT is enabled on Pentium
4 (i.e. famliy 0xF).

--Andy

2013-05-16 13:50:05

by Jerome Glisse

[permalink] [raw]
Subject: Re: [PATCH v3 6/9] radeon: Switch to arch_phys_wc_add and add a missing ..._del

On Wed, May 15, 2013 at 2:22 PM, Andy Lutomirski <[email protected]> wrote:
> On Wed, May 15, 2013 at 7:49 AM, Jerome Glisse <[email protected]> wrote:
>> On Tue, May 14, 2013 at 5:35 PM, Andy Lutomirski <[email protected]> wrote:
>>> On Tue, May 14, 2013 at 6:37 AM, Jerome Glisse <[email protected]> wrote:
>>>> On Tue, May 14, 2013 at 8:58 AM, Alex Deucher <[email protected]> wrote:
>>>>> On Mon, May 13, 2013 at 7:58 PM, Andy Lutomirski <[email protected]> wrote:
>>>>>> Reviewed-by: Daniel Vetter <[email protected]>
>>>>>> Signed-off-by: Andy Lutomirski <[email protected]>
>>>>>
>>>>> Reviewed-by: Alex Deucher <[email protected]>
>>>>
>>>> I believe it will break something but we could deal with the fallout
>>>> once it happens.
>>>
>>> FWIW, I'm running with this code on my machine right now using the
>>> radeon driver. Everything seems fine. If I build without MTRRs and
>>> without PAT, though, graphics are slow (as expected). So I think
>>> everything's okay.
>>>
>>> --Andy
>>
>> I am worried on p4 where i last see issue with that notably with agp.
>
> Do you remember any details? It looks like PAT is enabled on Pentium
> 4 (i.e. famliy 0xF).
>
> --Andy

No i don't, i think it was some pat errata on those about non real ram
address and with agp. Memory is fuzzy. I might have time in couple of
week to plug back my p4 and see how it behave.

Cheers,
Jerome

2013-05-16 21:00:53

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v3 6/9] radeon: Switch to arch_phys_wc_add and add a missing ..._del

On Thu, May 16, 2013 at 6:50 AM, Jerome Glisse <[email protected]> wrote:
> On Wed, May 15, 2013 at 2:22 PM, Andy Lutomirski <[email protected]> wrote:
>> On Wed, May 15, 2013 at 7:49 AM, Jerome Glisse <[email protected]> wrote:
>>> On Tue, May 14, 2013 at 5:35 PM, Andy Lutomirski <[email protected]> wrote:
>>>> On Tue, May 14, 2013 at 6:37 AM, Jerome Glisse <[email protected]> wrote:
>>>>> On Tue, May 14, 2013 at 8:58 AM, Alex Deucher <[email protected]> wrote:
>>>>>> On Mon, May 13, 2013 at 7:58 PM, Andy Lutomirski <[email protected]> wrote:
>>>>>>> Reviewed-by: Daniel Vetter <[email protected]>
>>>>>>> Signed-off-by: Andy Lutomirski <[email protected]>
>>>>>>
>>>>>> Reviewed-by: Alex Deucher <[email protected]>
>>>>>
>>>>> I believe it will break something but we could deal with the fallout
>>>>> once it happens.
>>>>
>>>> FWIW, I'm running with this code on my machine right now using the
>>>> radeon driver. Everything seems fine. If I build without MTRRs and
>>>> without PAT, though, graphics are slow (as expected). So I think
>>>> everything's okay.
>>>>
>>>> --Andy
>>>
>>> I am worried on p4 where i last see issue with that notably with agp.
>>
>> Do you remember any details? It looks like PAT is enabled on Pentium
>> 4 (i.e. famliy 0xF).
>>
>> --Andy
>
> No i don't, i think it was some pat errata on those about non real ram
> address and with agp. Memory is fuzzy. I might have time in couple of
> week to plug back my p4 and see how it behave.

Hmm. I couldn't find any PAT errata related to AGP.

Is it possible that the issue was that, if there was no MTRR covering
the AGP aperture, that the mapping ended up as writeback? If so, I
fixed that in this series.

In any case, if you see any problems, please let me know.

--Andy

2013-05-23 18:35:39

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v3 0/9] Clean up write-combining MTRR addition

On Mon, May 13, 2013 at 4:58 PM, Andy Lutomirski <[email protected]> wrote:
> A fair number of drivers (mostly graphics) add write-combining MTRRs.
> Most ignore errors and most add the MTRR even on PAT systems which don't
> need to use MTRRs.
>
> This series adds new functions arch_phys_wc_{add,del} that, on PAT-less
> x86 systems with MTRRs, add MTRRs and report errors, and that do nothing
> otherwise. (Other architectures, if any, with a similar mechanism could
> implement them.)

That's the path to upstream for this? Should it go through drm-next?
(Sorry for possible noob question -- I've never sent in anything other
than trivial fixes to drm stuff before.)

--Andy

2013-05-31 03:17:01

by Dave Airlie

[permalink] [raw]
Subject: Re: [PATCH v3 0/9] Clean up write-combining MTRR addition

On Fri, May 24, 2013 at 4:35 AM, Andy Lutomirski <[email protected]> wrote:
> On Mon, May 13, 2013 at 4:58 PM, Andy Lutomirski <[email protected]> wrote:
>> A fair number of drivers (mostly graphics) add write-combining MTRRs.
>> Most ignore errors and most add the MTRR even on PAT systems which don't
>> need to use MTRRs.
>>
>> This series adds new functions arch_phys_wc_{add,del} that, on PAT-less
>> x86 systems with MTRRs, add MTRRs and report errors, and that do nothing
>> otherwise. (Other architectures, if any, with a similar mechanism could
>> implement them.)
>
> That's the path to upstream for this? Should it go through drm-next?
> (Sorry for possible noob question -- I've never sent in anything other
> than trivial fixes to drm stuff before.)

I've pulled the v3 series into drm-next, lets see how they go for a while,

I suppose I should try and boot an AGP box with them.

Dave.

2013-05-31 03:47:33

by Dave Airlie

[permalink] [raw]
Subject: Re: [PATCH v3 3/9] drm: Update drm_addmap and drm_mmap to use PAT WC instead of MTRRs

On Tue, May 14, 2013 at 9:58 AM, Andy Lutomirski <[email protected]> wrote:
> Previously, DRM_FRAME_BUFFER mappings, as well as DRM_REGISTERS
> mappings with DRM_WRITE_COMBINING set, resulted in an unconditional
> MTRR being added but the actual mappings being created as UC-.
>
> Now these mappings have the MTRR added only if needed, but they will
> be mapped with pgprot_writecombine.
>
> The non-WC DRM_REGISTERS case now uses pgprot_noncached instead of
> hardcoding the bit twiddling.
>
> The DRM_AGP case is unchanged for now.

Just FYI this breaks on powerpc build, I've fixed it up and pushed the
fixed version to
drm-next-staging, I'll push that to drm-next in a couple of days, once
kbuild robot hits it.

Dave.