2015-04-30 20:27:39

by Luis Chamberlain

[permalink] [raw]
Subject: [PATCH v5 0/6] x86: address drivers that do not work with PAT

From: "Luis R. Rodriguez" <[email protected]>

This v5 drops the addition of new early_param_*() helpers
and their use on pat_enabled as we are sticking with
__read_mostly, and as per review this should be selectively
used only on well established hot paths. pat_enabled turns
out to be a common hot path, so we want to keep that. This
v5 also changes the pr_info() patch slightly to address the
feedback. The other patches do not change at all.

Luis R. Rodriguez (6):
x86/mm/pat: use pr_info() and friends
x86/mm/pat: redefine pat_enabled
arch/x86/mm/pat: export pat_enabled()
ivtv: use arch_phys_wc_add() and require PAT disabled
IB/ipath: add counting for MTRR
IB/ipath: use arch_phys_wc_add() and require PAT disabled

arch/x86/include/asm/pat.h | 7 +--
arch/x86/kernel/cpu/mtrr/main.c | 2 +-
arch/x86/mm/iomap_32.c | 2 +-
arch/x86/mm/ioremap.c | 4 +-
arch/x86/mm/pageattr.c | 2 +-
arch/x86/mm/pat.c | 75 +++++++++++++--------------
arch/x86/mm/pat_internal.h | 2 +-
arch/x86/mm/pat_rbtree.c | 5 +-
arch/x86/pci/i386.c | 6 +--
drivers/infiniband/hw/ipath/Kconfig | 3 ++
drivers/infiniband/hw/ipath/ipath_driver.c | 18 +++++--
drivers/infiniband/hw/ipath/ipath_kernel.h | 4 +-
drivers/infiniband/hw/ipath/ipath_wc_x86_64.c | 43 ++++-----------
drivers/media/pci/ivtv/Kconfig | 3 ++
drivers/media/pci/ivtv/ivtvfb.c | 58 ++++++++-------------
15 files changed, 103 insertions(+), 131 deletions(-)

--
2.3.2.209.gd67f9d5.dirty


2015-04-30 20:29:49

by Luis Chamberlain

[permalink] [raw]
Subject: [PATCH v5 1/6] x86/mm/pat: use pr_info() and friends

From: "Luis R. Rodriguez" <[email protected]>

Use pr_info() instead of the old printk to
prefix the component where things are coming
from. With this readers will know exactly where
the message is coming from. Since pr_fmt is
already defined in this case we redefine it to
"PAT: ".

We leave the users of dprintk() in place, this
will print only when the debugpat kernel parameter
is enabled. We want to leave those enabled as a
debug feature, but also make them use the same
prefix.

Cc: Andy Walls <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Luis R. Rodriguez <[email protected]>
---
arch/x86/mm/pat.c | 41 +++++++++++++++++++++--------------------
arch/x86/mm/pat_internal.h | 2 +-
arch/x86/mm/pat_rbtree.c | 5 ++++-
3 files changed, 26 insertions(+), 22 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 372ad42..8f88c6a 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -33,13 +33,16 @@
#include "pat_internal.h"
#include "mm_internal.h"

+#undef pr_fmt
+#define pr_fmt(fmt) "PAT: " fmt
+
#ifdef CONFIG_X86_PAT
int __read_mostly pat_enabled = 1;

static inline void pat_disable(const char *reason)
{
pat_enabled = 0;
- printk(KERN_INFO "%s\n", reason);
+ pr_info("%s\n", reason);
}

static int __init nopat(char *str)
@@ -211,8 +214,7 @@ void pat_init(void)
* switched to PAT on the boot CPU. We have no way to
* undo PAT.
*/
- printk(KERN_ERR "PAT enabled, "
- "but not supported by secondary CPU\n");
+ pr_err("PAT enabled, but not supported by secondary CPU\n");
BUG();
}
}
@@ -451,9 +453,9 @@ int reserve_memtype(u64 start, u64 end, enum page_cache_mode req_type,

err = rbt_memtype_check_insert(new, new_type);
if (err) {
- printk(KERN_INFO "reserve_memtype failed [mem %#010Lx-%#010Lx], track %s, req %s\n",
- start, end - 1,
- cattr_name(new->type), cattr_name(req_type));
+ pr_info("reserve_memtype failed [mem %#010Lx-%#010Lx], track %s, req %s\n",
+ start, end - 1,
+ cattr_name(new->type), cattr_name(req_type));
kfree(new);
spin_unlock(&memtype_lock);

@@ -497,8 +499,8 @@ int free_memtype(u64 start, u64 end)
spin_unlock(&memtype_lock);

if (!entry) {
- printk(KERN_INFO "%s:%d freeing invalid memtype [mem %#010Lx-%#010Lx]\n",
- current->comm, current->pid, start, end - 1);
+ pr_info("%s:%d freeing invalid memtype [mem %#010Lx-%#010Lx]\n",
+ current->comm, current->pid, start, end - 1);
return -EINVAL;
}

@@ -628,8 +630,8 @@ static inline int range_is_allowed(unsigned long pfn, unsigned long size)

while (cursor < to) {
if (!devmem_is_allowed(pfn)) {
- printk(KERN_INFO "Program %s tried to access /dev/mem between [mem %#010Lx-%#010Lx], PAT prevents it\n",
- current->comm, from, to - 1);
+ pr_info("Program %s tried to access /dev/mem between [mem %#010Lx-%#010Lx], PAT prevents it\n",
+ current->comm, from, to - 1);
return 0;
}
cursor += PAGE_SIZE;
@@ -698,8 +700,7 @@ int kernel_map_sync_memtype(u64 base, unsigned long size,
size;

if (ioremap_change_attr((unsigned long)__va(base), id_sz, pcm) < 0) {
- printk(KERN_INFO "%s:%d ioremap_change_attr failed %s "
- "for [mem %#010Lx-%#010Lx]\n",
+ pr_info("%s:%d ioremap_change_attr failed %s for [mem %#010Lx-%#010Lx]\n",
current->comm, current->pid,
cattr_name(pcm),
base, (unsigned long long)(base + size-1));
@@ -734,7 +735,7 @@ static int reserve_pfn_range(u64 paddr, unsigned long size, pgprot_t *vma_prot,

pcm = lookup_memtype(paddr);
if (want_pcm != pcm) {
- printk(KERN_WARNING "%s:%d map pfn RAM range req %s for [mem %#010Lx-%#010Lx], got %s\n",
+ pr_warn("%s:%d map pfn RAM range req %s for [mem %#010Lx-%#010Lx], got %s\n",
current->comm, current->pid,
cattr_name(want_pcm),
(unsigned long long)paddr,
@@ -755,13 +756,13 @@ static int reserve_pfn_range(u64 paddr, unsigned long size, pgprot_t *vma_prot,
if (strict_prot ||
!is_new_memtype_allowed(paddr, size, want_pcm, pcm)) {
free_memtype(paddr, paddr + size);
- printk(KERN_ERR "%s:%d map pfn expected mapping type %s"
- " for [mem %#010Lx-%#010Lx], got %s\n",
- current->comm, current->pid,
- cattr_name(want_pcm),
- (unsigned long long)paddr,
- (unsigned long long)(paddr + size - 1),
- cattr_name(pcm));
+ pr_err("%s:%d map pfn expected mapping type %s"
+ " for [mem %#010Lx-%#010Lx], got %s\n",
+ current->comm, current->pid,
+ cattr_name(want_pcm),
+ (unsigned long long)paddr,
+ (unsigned long long)(paddr + size - 1),
+ cattr_name(pcm));
return -EINVAL;
}
/*
diff --git a/arch/x86/mm/pat_internal.h b/arch/x86/mm/pat_internal.h
index f641162..ea7fbf0 100644
--- a/arch/x86/mm/pat_internal.h
+++ b/arch/x86/mm/pat_internal.h
@@ -4,7 +4,7 @@
extern int pat_debug_enable;

#define dprintk(fmt, arg...) \
- do { if (pat_debug_enable) printk(KERN_INFO fmt, ##arg); } while (0)
+ do { if (pat_debug_enable) pr_info(fmt, ##arg); } while (0)

struct memtype {
u64 start;
diff --git a/arch/x86/mm/pat_rbtree.c b/arch/x86/mm/pat_rbtree.c
index 6582adc..374539b 100644
--- a/arch/x86/mm/pat_rbtree.c
+++ b/arch/x86/mm/pat_rbtree.c
@@ -21,6 +21,9 @@

#include "pat_internal.h"

+#undef pr_fmt
+#define pr_fmt(fmt) "PAT: " fmt
+
/*
* The memtype tree keeps track of memory type for specific
* physical memory areas. Without proper tracking, conflicting memory
@@ -160,7 +163,7 @@ success:
return 0;

failure:
- printk(KERN_INFO "%s:%d conflicting memory types "
+ pr_info("%s:%d conflicting memory types "
"%Lx-%Lx %s<->%s\n", current->comm, current->pid, start,
end, cattr_name(found_type), cattr_name(match->type));
return -EBUSY;
--
2.3.2.209.gd67f9d5.dirty

2015-04-30 20:32:02

by Luis Chamberlain

[permalink] [raw]
Subject: [PATCH v5 2/6] x86/mm/pat: redefine pat_enabled

From: "Luis R. Rodriguez" <[email protected]>

We use pat_enabled on x86 specific code to see if PAT
is enabled or not, we however are granting full access to
the variable even though readers do not need to set it.
If for instance we granted access to it to modules later
they then could override the variable setting... no bueno.

This renames pat_enabled under a new static variable
__pat_enabled, to see if PAT is enabled / disabled folks
can just use pat_enabled() now. Code that sets this can
only be internal to pat.c. Apart from the early kernel
parameter "nopat" to disable PAT we also have a few
cases that disable it later and make use of a helper
pat_disable(), this helper is wrapped under an ifdef but
since that code cannot run unless PAT was enabled its not
required to wrap it with ifdefs, unwrap that. Likewise
since "nopat" doesn't really change non-PAT systems
just remove that ifdef as well.

Although we could add and use an early_param_off()
these helpers don't use __read_mostly and we want to
keep __read_mostly for __pat_enabled as this is a hot
path -- upon boot for instance a simple guest may see
~4k accesses to pat_enabled(). Since __read_mostly
early boot params are not that common we don't add a
helper for them just yet.

Cc: Borislav Petkov <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Kyle McMartin <[email protected]>
Cc: Andy Walls <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Luis R. Rodriguez <[email protected]>
---
arch/x86/include/asm/pat.h | 7 +------
arch/x86/kernel/cpu/mtrr/main.c | 2 +-
arch/x86/mm/iomap_32.c | 2 +-
arch/x86/mm/ioremap.c | 4 ++--
arch/x86/mm/pageattr.c | 2 +-
arch/x86/mm/pat.c | 33 +++++++++++++++------------------
arch/x86/pci/i386.c | 6 +++---
7 files changed, 24 insertions(+), 32 deletions(-)

diff --git a/arch/x86/include/asm/pat.h b/arch/x86/include/asm/pat.h
index 91bc4ba..cdcff7f 100644
--- a/arch/x86/include/asm/pat.h
+++ b/arch/x86/include/asm/pat.h
@@ -4,12 +4,7 @@
#include <linux/types.h>
#include <asm/pgtable_types.h>

-#ifdef CONFIG_X86_PAT
-extern int pat_enabled;
-#else
-static const int pat_enabled;
-#endif
-
+bool pat_enabled(void);
extern void pat_init(void);
void pat_init_cache_modes(void);

diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
index bfef424..f094d36 100644
--- a/arch/x86/kernel/cpu/mtrr/main.c
+++ b/arch/x86/kernel/cpu/mtrr/main.c
@@ -556,7 +556,7 @@ int arch_phys_wc_add(unsigned long base, unsigned long size)
{
int ret;

- if (pat_enabled || !mtrr_enabled)
+ if (pat_enabled() || !mtrr_enabled)
return 0; /* Success! (We don't need to do anything.) */

ret = mtrr_add(base, size, MTRR_TYPE_WRCOMB, true);
diff --git a/arch/x86/mm/iomap_32.c b/arch/x86/mm/iomap_32.c
index 9ca35fc..3a2ec87 100644
--- a/arch/x86/mm/iomap_32.c
+++ b/arch/x86/mm/iomap_32.c
@@ -82,7 +82,7 @@ iomap_atomic_prot_pfn(unsigned long pfn, pgprot_t prot)
* MTRR is UC or WC. UC_MINUS gets the real intention, of the
* user, which is "WC if the MTRR is WC, UC if you can't do that."
*/
- if (!pat_enabled && pgprot_val(prot) ==
+ if (!pat_enabled() && pgprot_val(prot) ==
(__PAGE_KERNEL | cachemode2protval(_PAGE_CACHE_MODE_WC)))
prot = __pgprot(__PAGE_KERNEL |
cachemode2protval(_PAGE_CACHE_MODE_UC_MINUS));
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index a493bb8..82d63ed 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -234,7 +234,7 @@ void __iomem *ioremap_nocache(resource_size_t phys_addr, unsigned long size)
{
/*
* Ideally, this should be:
- * pat_enabled ? _PAGE_CACHE_MODE_UC : _PAGE_CACHE_MODE_UC_MINUS;
+ * pat_enabled() ? _PAGE_CACHE_MODE_UC : _PAGE_CACHE_MODE_UC_MINUS;
*
* Till we fix all X drivers to use ioremap_wc(), we will use
* UC MINUS. Drivers that are certain they need or can already
@@ -292,7 +292,7 @@ EXPORT_SYMBOL_GPL(ioremap_uc);
*/
void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
{
- if (pat_enabled)
+ if (pat_enabled())
return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WC,
__builtin_return_address(0));
else
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 49660c0..0aa8dd8 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -1574,7 +1574,7 @@ int set_memory_wc(unsigned long addr, int numpages)
{
int ret;

- if (!pat_enabled)
+ if (!pat_enabled())
return set_memory_uc(addr, numpages);

ret = reserve_memtype(__pa(addr), __pa(addr) + numpages * PAGE_SIZE,
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 8f88c6a..f64785e 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -36,12 +36,11 @@
#undef pr_fmt
#define pr_fmt(fmt) "PAT: " fmt

-#ifdef CONFIG_X86_PAT
-int __read_mostly pat_enabled = 1;
+static int __read_mostly __pat_enabled = IS_ENABLED(CONFIG_X86_PAT);

static inline void pat_disable(const char *reason)
{
- pat_enabled = 0;
+ __pat_enabled = false;
pr_info("%s\n", reason);
}

@@ -51,13 +50,11 @@ static int __init nopat(char *str)
return 0;
}
early_param("nopat", nopat);
-#else
-static inline void pat_disable(const char *reason)
+
+bool pat_enabled(void)
{
- (void)reason;
+ return !!__pat_enabled;
}
-#endif
-

int pat_debug_enable;

@@ -201,7 +198,7 @@ void pat_init(void)
u64 pat;
bool boot_cpu = !boot_pat_state;

- if (!pat_enabled)
+ if (!pat_enabled())
return;

if (!cpu_has_pat) {
@@ -402,7 +399,7 @@ int reserve_memtype(u64 start, u64 end, enum page_cache_mode req_type,

BUG_ON(start >= end); /* end is exclusive */

- if (!pat_enabled) {
+ if (!pat_enabled()) {
/* This is identical to page table setting without PAT */
if (new_type) {
if (req_type == _PAGE_CACHE_MODE_WC)
@@ -477,7 +474,7 @@ int free_memtype(u64 start, u64 end)
int is_range_ram;
struct memtype *entry;

- if (!pat_enabled)
+ if (!pat_enabled())
return 0;

/* Low ISA region is always mapped WB. No need to track */
@@ -625,7 +622,7 @@ static inline int range_is_allowed(unsigned long pfn, unsigned long size)
u64 to = from + size;
u64 cursor = from;

- if (!pat_enabled)
+ if (!pat_enabled())
return 1;

while (cursor < to) {
@@ -661,7 +658,7 @@ int phys_mem_access_prot_allowed(struct file *file, unsigned long pfn,
* caching for the high addresses through the KEN pin, but
* we maintain the tradition of paranoia in this code.
*/
- if (!pat_enabled &&
+ if (!pat_enabled() &&
!(boot_cpu_has(X86_FEATURE_MTRR) ||
boot_cpu_has(X86_FEATURE_K6_MTRR) ||
boot_cpu_has(X86_FEATURE_CYRIX_ARR) ||
@@ -730,7 +727,7 @@ static int reserve_pfn_range(u64 paddr, unsigned long size, pgprot_t *vma_prot,
* the type requested matches the type of first page in the range.
*/
if (is_ram) {
- if (!pat_enabled)
+ if (!pat_enabled())
return 0;

pcm = lookup_memtype(paddr);
@@ -845,7 +842,7 @@ int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
return ret;
}

- if (!pat_enabled)
+ if (!pat_enabled())
return 0;

/*
@@ -873,7 +870,7 @@ int track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
{
enum page_cache_mode pcm;

- if (!pat_enabled)
+ if (!pat_enabled())
return 0;

/* Set prot based on lookup */
@@ -914,7 +911,7 @@ void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,

pgprot_t pgprot_writecombine(pgprot_t prot)
{
- if (pat_enabled)
+ if (pat_enabled())
return __pgprot(pgprot_val(prot) |
cachemode2protval(_PAGE_CACHE_MODE_WC));
else
@@ -997,7 +994,7 @@ static const struct file_operations memtype_fops = {

static int __init pat_memtype_list_init(void)
{
- if (pat_enabled) {
+ if (pat_enabled()) {
debugfs_create_file("pat_memtype_list", S_IRUSR,
arch_debugfs_dir, NULL, &memtype_fops);
}
diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
index 349c0d3..0a9f2ca 100644
--- a/arch/x86/pci/i386.c
+++ b/arch/x86/pci/i386.c
@@ -429,12 +429,12 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
* Caller can followup with UC MINUS request and add a WC mtrr if there
* is a free mtrr slot.
*/
- if (!pat_enabled && write_combine)
+ if (!pat_enabled() && write_combine)
return -EINVAL;

- if (pat_enabled && write_combine)
+ if (pat_enabled() && write_combine)
prot |= cachemode2protval(_PAGE_CACHE_MODE_WC);
- else if (pat_enabled || boot_cpu_data.x86 > 3)
+ else if (pat_enabled() || boot_cpu_data.x86 > 3)
/*
* ioremap() and ioremap_nocache() defaults to UC MINUS for now.
* To avoid attribute conflicts, request UC MINUS here
--
2.3.2.209.gd67f9d5.dirty

2015-04-30 20:34:13

by Luis Chamberlain

[permalink] [raw]
Subject: [PATCH v5 3/6] arch/x86/mm/pat: export pat_enabled()

From: "Luis R. Rodriguez" <[email protected]>

Two Linux device drivers cannot work with PAT and the work
required to make them work is significant. There is not
enough motivation to convert these drivers over to use
PAT properly, the compromise reached is to let drivers
that cannot be ported to PAT check if PAT was enabled
and if so fail on probe with a recommendation to boot
with the "nopat" kernel parameter.

Cc: Andy Walls <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Luis R. Rodriguez <[email protected]>
---
arch/x86/mm/pat.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index f64785e..3d60207 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -55,6 +55,7 @@ bool pat_enabled(void)
{
return !!__pat_enabled;
}
+EXPORT_SYMBOL_GPL(pat_enabled);

int pat_debug_enable;

--
2.3.2.209.gd67f9d5.dirty

2015-04-30 20:36:27

by Luis Chamberlain

[permalink] [raw]
Subject: [PATCH v5 4/6] ivtv: use arch_phys_wc_add() and require PAT disabled

From: "Luis R. Rodriguez" <[email protected]>

We are burrying direct access to MTRR code support on
x86 in order to take advantage of PAT. In the future we
also want to make the default behaviour of ioremap_nocache()
to use strong UC, use of mtrr_add() on those systems
would make write-combining void.

In order to help both enable us to later make strong
UC default and in order to phase out direct MTRR access
code port the driver over to arch_phys_wc_add() and
annotate that the device driver requires systems to
boot with PAT disabled, with the nopat kernel parameter.

This is a worthy comprmise given that the hardware is
really rare these days, and perhaps only some lost souls
in some third world country are expected to be using this
feature of the device driver.

Acked-by: Andy Walls <[email protected]>
Cc: Andy Walls <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Mauro Carvalho Chehab <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Suresh Siddha <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Antonino Daplas <[email protected]>
Cc: Jean-Christophe Plagniol-Villard <[email protected]>
Cc: Tomi Valkeinen <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Stefan Bader <[email protected]>
Cc: Ville Syrjälä <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Roger Pau Monné <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Luis R. Rodriguez <[email protected]>
---
drivers/media/pci/ivtv/Kconfig | 3 +++
drivers/media/pci/ivtv/ivtvfb.c | 58 ++++++++++++++++-------------------------
2 files changed, 26 insertions(+), 35 deletions(-)

diff --git a/drivers/media/pci/ivtv/Kconfig b/drivers/media/pci/ivtv/Kconfig
index dd6ee57e..b2a7f88 100644
--- a/drivers/media/pci/ivtv/Kconfig
+++ b/drivers/media/pci/ivtv/Kconfig
@@ -57,5 +57,8 @@ config VIDEO_FB_IVTV
This is used in the Hauppauge PVR-350 card. There is a driver
homepage at <http://www.ivtvdriver.org>.

+ If you have this hardware you will need to boot with PAT disabled
+ on your x86 systems, use the nopat kernel parameter.
+
To compile this driver as a module, choose M here: the
module will be called ivtvfb.
diff --git a/drivers/media/pci/ivtv/ivtvfb.c b/drivers/media/pci/ivtv/ivtvfb.c
index 9ff1230..7685ae3 100644
--- a/drivers/media/pci/ivtv/ivtvfb.c
+++ b/drivers/media/pci/ivtv/ivtvfb.c
@@ -44,8 +44,8 @@
#include <linux/ivtvfb.h>
#include <linux/slab.h>

-#ifdef CONFIG_MTRR
-#include <asm/mtrr.h>
+#ifdef CONFIG_X86_64
+#include <asm/pat.h>
#endif

#include "ivtv-driver.h"
@@ -155,12 +155,11 @@ struct osd_info {
/* Buffer size */
u32 video_buffer_size;

-#ifdef CONFIG_MTRR
/* video_base rounded down as required by hardware MTRRs */
unsigned long fb_start_aligned_physaddr;
/* video_base rounded up as required by hardware MTRRs */
unsigned long fb_end_aligned_physaddr;
-#endif
+ int wc_cookie;

/* Store the buffer offset */
int set_osd_coords_x;
@@ -1099,6 +1098,8 @@ static int ivtvfb_init_vidmode(struct ivtv *itv)
static int ivtvfb_init_io(struct ivtv *itv)
{
struct osd_info *oi = itv->osd_info;
+ /* Find the largest power of two that maps the whole buffer */
+ int size_shift = 31;

mutex_lock(&itv->serialize_lock);
if (ivtv_init_on_first_open(itv)) {
@@ -1132,29 +1133,16 @@ static int ivtvfb_init_io(struct ivtv *itv)
oi->video_pbase, oi->video_vbase,
oi->video_buffer_size / 1024);

-#ifdef CONFIG_MTRR
- {
- /* Find the largest power of two that maps the whole buffer */
- int size_shift = 31;
-
- while (!(oi->video_buffer_size & (1 << size_shift))) {
- size_shift--;
- }
- size_shift++;
- oi->fb_start_aligned_physaddr = oi->video_pbase & ~((1 << size_shift) - 1);
- oi->fb_end_aligned_physaddr = oi->video_pbase + oi->video_buffer_size;
- oi->fb_end_aligned_physaddr += (1 << size_shift) - 1;
- oi->fb_end_aligned_physaddr &= ~((1 << size_shift) - 1);
- if (mtrr_add(oi->fb_start_aligned_physaddr,
- oi->fb_end_aligned_physaddr - oi->fb_start_aligned_physaddr,
- MTRR_TYPE_WRCOMB, 1) < 0) {
- IVTVFB_INFO("disabled mttr\n");
- oi->fb_start_aligned_physaddr = 0;
- oi->fb_end_aligned_physaddr = 0;
- }
- }
-#endif
-
+ while (!(oi->video_buffer_size & (1 << size_shift)))
+ size_shift--;
+ size_shift++;
+ oi->fb_start_aligned_physaddr = oi->video_pbase & ~((1 << size_shift) - 1);
+ oi->fb_end_aligned_physaddr = oi->video_pbase + oi->video_buffer_size;
+ oi->fb_end_aligned_physaddr += (1 << size_shift) - 1;
+ oi->fb_end_aligned_physaddr &= ~((1 << size_shift) - 1);
+ oi->wc_cookie = arch_phys_wc_add(oi->fb_start_aligned_physaddr,
+ oi->fb_end_aligned_physaddr -
+ oi->fb_start_aligned_physaddr);
/* Blank the entire osd. */
memset_io(oi->video_vbase, 0, oi->video_buffer_size);

@@ -1172,14 +1160,7 @@ static void ivtvfb_release_buffers (struct ivtv *itv)

/* Release pseudo palette */
kfree(oi->ivtvfb_info.pseudo_palette);
-
-#ifdef CONFIG_MTRR
- if (oi->fb_end_aligned_physaddr) {
- mtrr_del(-1, oi->fb_start_aligned_physaddr,
- oi->fb_end_aligned_physaddr - oi->fb_start_aligned_physaddr);
- }
-#endif
-
+ arch_phys_wc_del(oi->wc_cookie);
kfree(oi);
itv->osd_info = NULL;
}
@@ -1284,6 +1265,13 @@ static int __init ivtvfb_init(void)
int registered = 0;
int err;

+#ifdef CONFIG_X86_64
+ if (WARN(pat_enabled(),
+ "ivtvfb needs PAT disabled, boot with nopat kernel parameter\n")) {
+ return EINVAL;
+ }
+#endif
+
if (ivtvfb_card_id < -1 || ivtvfb_card_id >= IVTV_MAX_CARDS) {
printk(KERN_ERR "ivtvfb: ivtvfb_card_id parameter is out of range (valid range: -1 - %d)\n",
IVTV_MAX_CARDS - 1);
--
2.3.2.209.gd67f9d5.dirty

2015-04-30 20:38:35

by Luis Chamberlain

[permalink] [raw]
Subject: [PATCH v5 5/6] IB/ipath: add counting for MTRR

From: "Luis R. Rodriguez" <[email protected]>

There is no good reason not to, we eventually delete it as well.

Cc: Toshi Kani <[email protected]>
Cc: Roland Dreier <[email protected]>
Cc: Sean Hefty <[email protected]>
Cc: Hal Rosenstock <[email protected]>
Cc: Suresh Siddha <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Antonino Daplas <[email protected]>
Cc: Jean-Christophe Plagniol-Villard <[email protected]>
Cc: Tomi Valkeinen <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Luis R. Rodriguez <[email protected]>
---
drivers/infiniband/hw/ipath/ipath_wc_x86_64.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/ipath/ipath_wc_x86_64.c b/drivers/infiniband/hw/ipath/ipath_wc_x86_64.c
index 4ad0b93..70c1f3a 100644
--- a/drivers/infiniband/hw/ipath/ipath_wc_x86_64.c
+++ b/drivers/infiniband/hw/ipath/ipath_wc_x86_64.c
@@ -127,7 +127,7 @@ int ipath_enable_wc(struct ipath_devdata *dd)
"(addr %llx, len=0x%llx)\n",
(unsigned long long) pioaddr,
(unsigned long long) piolen);
- cookie = mtrr_add(pioaddr, piolen, MTRR_TYPE_WRCOMB, 0);
+ cookie = mtrr_add(pioaddr, piolen, MTRR_TYPE_WRCOMB, 1);
if (cookie < 0) {
{
dev_info(&dd->pcidev->dev,
--
2.3.2.209.gd67f9d5.dirty

2015-04-30 20:40:48

by Luis Chamberlain

[permalink] [raw]
Subject: [PATCH v5 6/6] IB/ipath: use arch_phys_wc_add() and require PAT disabled

From: "Luis R. Rodriguez" <[email protected]>

We are burrying direct access to MTRR code support on
x86 in order to take advantage of PAT. In the future we
also want to make the default behaviour of ioremap_nocache()
to use strong UC, use of mtrr_add() on those systems
would make write-combining void.

In order to help both enable us to later make strong
UC default and in order to phase out direct MTRR access
code port the driver over to arch_phys_wc_add() and
annotate that the device driver requires systems to
boot with PAT disabled, with the nopat kernel parameter.

This is a worthy compromise given that the ipath device
driver powers the old HTX bus cards that only work in
AMD systems, while the newer IB/qib device driver
powers all PCI-e cards. The ipath device driver is
obsolete, hardware hard to find and because of this
this its a reasonable compromise to make to require
users of ipath to boot with nopat.

Acked-by: Doug Ledford <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Andy Walls <[email protected]>
Cc: Hal Rosenstock <[email protected]>
Cc: Sean Hefty <[email protected]>
Cc: Suresh Siddha <[email protected]>
Cc: Rickard Strandqvist <[email protected]>
Cc: Mike Marciniszyn <[email protected]>
Cc: Roland Dreier <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Antonino Daplas <[email protected]>
Cc: Jean-Christophe Plagniol-Villard <[email protected]>
Cc: Tomi Valkeinen <[email protected]>
Cc: Ville Syrjälä <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Stefan Bader <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Roger Pau Monné <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Luis R. Rodriguez <[email protected]>
---
drivers/infiniband/hw/ipath/Kconfig | 3 ++
drivers/infiniband/hw/ipath/ipath_driver.c | 18 +++++++----
drivers/infiniband/hw/ipath/ipath_kernel.h | 4 +--
drivers/infiniband/hw/ipath/ipath_wc_x86_64.c | 43 ++++++---------------------
4 files changed, 26 insertions(+), 42 deletions(-)

diff --git a/drivers/infiniband/hw/ipath/Kconfig b/drivers/infiniband/hw/ipath/Kconfig
index 1d9bb11..8fe54ff 100644
--- a/drivers/infiniband/hw/ipath/Kconfig
+++ b/drivers/infiniband/hw/ipath/Kconfig
@@ -9,3 +9,6 @@ config INFINIBAND_IPATH
as IP-over-InfiniBand as well as with userspace applications
(in conjunction with InfiniBand userspace access).
For QLogic PCIe QLE based cards, use the QIB driver instead.
+
+ If you have this hardware you will need to boot with PAT disabled
+ on your x86-64 systems, use the nopat kernel parameter.
diff --git a/drivers/infiniband/hw/ipath/ipath_driver.c b/drivers/infiniband/hw/ipath/ipath_driver.c
index bd0caed..441cfe5 100644
--- a/drivers/infiniband/hw/ipath/ipath_driver.c
+++ b/drivers/infiniband/hw/ipath/ipath_driver.c
@@ -42,6 +42,9 @@
#include <linux/bitmap.h>
#include <linux/slab.h>
#include <linux/module.h>
+#ifdef CONFIG_X86_64
+#include <asm/pat.h>
+#endif

#include "ipath_kernel.h"
#include "ipath_verbs.h"
@@ -395,6 +398,14 @@ static int ipath_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
unsigned long long addr;
u32 bar0 = 0, bar1 = 0;

+#ifdef CONFIG_X86_64
+ if (WARN(pat_enabled(),
+ "ipath needs PAT disabled, boot with nopat kernel parameter\n")) {
+ ret = EINVAL;
+ goto bail;
+ }
+#endif
+
dd = ipath_alloc_devdata(pdev);
if (IS_ERR(dd)) {
ret = PTR_ERR(dd);
@@ -542,6 +553,7 @@ static int ipath_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
dd->ipath_kregbase = __ioremap(addr, len,
(_PAGE_NO_CACHE|_PAGE_WRITETHRU));
#else
+ /* XXX: split this properly to enable on PAT */
dd->ipath_kregbase = ioremap_nocache(addr, len);
#endif

@@ -587,12 +599,8 @@ static int ipath_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)

ret = ipath_enable_wc(dd);

- if (ret) {
- ipath_dev_err(dd, "Write combining not enabled "
- "(err %d): performance may be poor\n",
- -ret);
+ if (ret)
ret = 0;
- }

ipath_verify_pioperf(dd);

diff --git a/drivers/infiniband/hw/ipath/ipath_kernel.h b/drivers/infiniband/hw/ipath/ipath_kernel.h
index e08db70..f0f9471 100644
--- a/drivers/infiniband/hw/ipath/ipath_kernel.h
+++ b/drivers/infiniband/hw/ipath/ipath_kernel.h
@@ -463,9 +463,7 @@ struct ipath_devdata {
/* offset in HT config space of slave/primary interface block */
u8 ipath_ht_slave_off;
/* for write combining settings */
- unsigned long ipath_wc_cookie;
- unsigned long ipath_wc_base;
- unsigned long ipath_wc_len;
+ int wc_cookie;
/* ref count for each pkey */
atomic_t ipath_pkeyrefs[4];
/* shadow copy of struct page *'s for exp tid pages */
diff --git a/drivers/infiniband/hw/ipath/ipath_wc_x86_64.c b/drivers/infiniband/hw/ipath/ipath_wc_x86_64.c
index 70c1f3a..7b6e4c8 100644
--- a/drivers/infiniband/hw/ipath/ipath_wc_x86_64.c
+++ b/drivers/infiniband/hw/ipath/ipath_wc_x86_64.c
@@ -37,7 +37,6 @@
*/

#include <linux/pci.h>
-#include <asm/mtrr.h>
#include <asm/processor.h>

#include "ipath_kernel.h"
@@ -122,27 +121,14 @@ int ipath_enable_wc(struct ipath_devdata *dd)
}

if (!ret) {
- int cookie;
- ipath_cdbg(VERBOSE, "Setting mtrr for chip to WC "
- "(addr %llx, len=0x%llx)\n",
- (unsigned long long) pioaddr,
- (unsigned long long) piolen);
- cookie = mtrr_add(pioaddr, piolen, MTRR_TYPE_WRCOMB, 1);
- if (cookie < 0) {
- {
- dev_info(&dd->pcidev->dev,
- "mtrr_add() WC for PIO bufs "
- "failed (%d)\n",
- cookie);
- ret = -EINVAL;
- }
- } else {
- ipath_cdbg(VERBOSE, "Set mtrr for chip to WC, "
- "cookie is %d\n", cookie);
- dd->ipath_wc_cookie = cookie;
- dd->ipath_wc_base = (unsigned long) pioaddr;
- dd->ipath_wc_len = (unsigned long) piolen;
- }
+ dd->wc_cookie = arch_phys_wc_add(pioaddr, piolen);
+ if (dd->wc_cookie < 0) {
+ ipath_dev_err(dd, "Seting mtrr failed on PIO buffers\n");
+ ret = -ENODEV;
+ } else if (dd->wc_cookie == 0)
+ ipath_cdbg(VERBOSE, "Set mtrr for chip to WC not needed\n");
+ else
+ ipath_cdbg(VERBOSE, "Set mtrr for chip to WC\n");
}

return ret;
@@ -154,16 +140,5 @@ int ipath_enable_wc(struct ipath_devdata *dd)
*/
void ipath_disable_wc(struct ipath_devdata *dd)
{
- if (dd->ipath_wc_cookie) {
- int r;
- ipath_cdbg(VERBOSE, "undoing WCCOMB on pio buffers\n");
- r = mtrr_del(dd->ipath_wc_cookie, dd->ipath_wc_base,
- dd->ipath_wc_len);
- if (r < 0)
- dev_info(&dd->pcidev->dev,
- "mtrr_del(%lx, %lx, %lx) failed: %d\n",
- dd->ipath_wc_cookie, dd->ipath_wc_base,
- dd->ipath_wc_len, r);
- dd->ipath_wc_cookie = 0; /* even on failure */
- }
+ arch_phys_wc_del(dd->wc_cookie);
}
--
2.3.2.209.gd67f9d5.dirty

2015-05-04 14:58:37

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v5 1/6] x86/mm/pat: use pr_info() and friends

On Thu, Apr 30, 2015 at 01:25:15PM -0700, Luis R. Rodriguez wrote:
> From: "Luis R. Rodriguez" <[email protected]>
>
> Use pr_info() instead of the old printk to
> prefix the component where things are coming
> from. With this readers will know exactly where
> the message is coming from. Since pr_fmt is
> already defined in this case we redefine it to
> "PAT: ".
>
> We leave the users of dprintk() in place, this
> will print only when the debugpat kernel parameter
> is enabled. We want to leave those enabled as a
> debug feature, but also make them use the same
> prefix.
>
> Cc: Andy Walls <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Juergen Gross <[email protected]>
> Cc: Daniel Vetter <[email protected]>
> Cc: Dave Airlie <[email protected]>
> Cc: Bjorn Helgaas <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Michael S. Tsirkin <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Luis R. Rodriguez <[email protected]>
> ---
> arch/x86/mm/pat.c | 41 +++++++++++++++++++++--------------------
> arch/x86/mm/pat_internal.h | 2 +-
> arch/x86/mm/pat_rbtree.c | 5 ++++-
> 3 files changed, 26 insertions(+), 22 deletions(-)
>
> diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
> index 372ad42..8f88c6a 100644
> --- a/arch/x86/mm/pat.c
> +++ b/arch/x86/mm/pat.c
> @@ -33,13 +33,16 @@
> #include "pat_internal.h"
> #include "mm_internal.h"
>
> +#undef pr_fmt
> +#define pr_fmt(fmt) "PAT: " fmt

Hmm, ok, so those pr_* helpers with the prefix actually make grepping
for the error message not fun. So I take that back about the pr_fmt
thing - it is a bad idea.

Instead, we should still use pr_* because they're shorter but simply add
the prefix before each message so that you can grep for it.

I went and I did that, see below.

---
From: "Luis R. Rodriguez" <[email protected]>
Date: Thu, 30 Apr 2015 13:25:15 -0700
Subject: [PATCH] x86/mm/pat: Convert to pr_* usage

Use pr_info() instead of the old printk to prefix the component where
things are coming from. With this readers will know exactly where the
message is coming from. Since pr_fmt is already defined in this case we
redefine it to "x86/PAT: ".

We leave the users of dprintk() in place, this will print only when the
debugpat kernel parameter is enabled. We want to leave those enabled as
a debug feature, but also make them use the same prefix.

Signed-off-by: Luis R. Rodriguez <[email protected]>
Cc: Andy Walls <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by:
---
arch/x86/mm/pat.c | 41 +++++++++++++++++++----------------------
arch/x86/mm/pat_internal.h | 2 +-
arch/x86/mm/pat_rbtree.c | 2 +-
3 files changed, 21 insertions(+), 24 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 35af6771a95a..8269c784b61c 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -39,7 +39,7 @@ int __read_mostly pat_enabled = 1;
static inline void pat_disable(const char *reason)
{
pat_enabled = 0;
- printk(KERN_INFO "%s\n", reason);
+ pr_info("x86/PAT: %s\n", reason);
}

static int __init nopat(char *str)
@@ -188,7 +188,7 @@ void pat_init_cache_modes(void)
pat_msg + 4 * i);
update_cache_mode_entry(i, cache);
}
- pr_info("PAT configuration [0-7]: %s\n", pat_msg);
+ pr_info("x86/PAT: Configuration [0-7]: %s\n", pat_msg);
}

#define PAT(x, y) ((u64)PAT_ ## y << ((x)*8))
@@ -211,8 +211,7 @@ void pat_init(void)
* switched to PAT on the boot CPU. We have no way to
* undo PAT.
*/
- printk(KERN_ERR "PAT enabled, "
- "but not supported by secondary CPU\n");
+ pr_err("x86/PAT: PAT enabled, but not supported by secondary CPU\n");
BUG();
}
}
@@ -347,7 +346,7 @@ static int reserve_ram_pages_type(u64 start, u64 end,
page = pfn_to_page(pfn);
type = get_page_memtype(page);
if (type != -1) {
- pr_info("reserve_ram_pages_type failed [mem %#010Lx-%#010Lx], track 0x%x, req 0x%x\n",
+ pr_info("x86/PAT: reserve_ram_pages_type failed [mem %#010Lx-%#010Lx], track 0x%x, req 0x%x\n",
start, end - 1, type, req_type);
if (new_type)
*new_type = type;
@@ -451,9 +450,9 @@ int reserve_memtype(u64 start, u64 end, enum page_cache_mode req_type,

err = rbt_memtype_check_insert(new, new_type);
if (err) {
- printk(KERN_INFO "reserve_memtype failed [mem %#010Lx-%#010Lx], track %s, req %s\n",
- start, end - 1,
- cattr_name(new->type), cattr_name(req_type));
+ pr_info("x86/PAT: reserve_memtype failed [mem %#010Lx-%#010Lx], track %s, req %s\n",
+ start, end - 1,
+ cattr_name(new->type), cattr_name(req_type));
kfree(new);
spin_unlock(&memtype_lock);

@@ -497,8 +496,8 @@ int free_memtype(u64 start, u64 end)
spin_unlock(&memtype_lock);

if (!entry) {
- printk(KERN_INFO "%s:%d freeing invalid memtype [mem %#010Lx-%#010Lx]\n",
- current->comm, current->pid, start, end - 1);
+ pr_info("x86/PAT: %s:%d freeing invalid memtype [mem %#010Lx-%#010Lx]\n",
+ current->comm, current->pid, start, end - 1);
return -EINVAL;
}

@@ -628,8 +627,8 @@ static inline int range_is_allowed(unsigned long pfn, unsigned long size)

while (cursor < to) {
if (!devmem_is_allowed(pfn)) {
- printk(KERN_INFO "Program %s tried to access /dev/mem between [mem %#010Lx-%#010Lx], PAT prevents it\n",
- current->comm, from, to - 1);
+ pr_info("x86/PAT: Program %s tried to access /dev/mem between [mem %#010Lx-%#010Lx], PAT prevents it\n",
+ current->comm, from, to - 1);
return 0;
}
cursor += PAGE_SIZE;
@@ -698,8 +697,7 @@ int kernel_map_sync_memtype(u64 base, unsigned long size,
size;

if (ioremap_change_attr((unsigned long)__va(base), id_sz, pcm) < 0) {
- printk(KERN_INFO "%s:%d ioremap_change_attr failed %s "
- "for [mem %#010Lx-%#010Lx]\n",
+ pr_info("x86/PAT: %s:%d ioremap_change_attr failed %s for [mem %#010Lx-%#010Lx]\n",
current->comm, current->pid,
cattr_name(pcm),
base, (unsigned long long)(base + size-1));
@@ -734,7 +732,7 @@ static int reserve_pfn_range(u64 paddr, unsigned long size, pgprot_t *vma_prot,

pcm = lookup_memtype(paddr);
if (want_pcm != pcm) {
- printk(KERN_WARNING "%s:%d map pfn RAM range req %s for [mem %#010Lx-%#010Lx], got %s\n",
+ pr_warn("x86/PAT: %s:%d map pfn RAM range req %s for [mem %#010Lx-%#010Lx], got %s\n",
current->comm, current->pid,
cattr_name(want_pcm),
(unsigned long long)paddr,
@@ -755,13 +753,12 @@ static int reserve_pfn_range(u64 paddr, unsigned long size, pgprot_t *vma_prot,
if (strict_prot ||
!is_new_memtype_allowed(paddr, size, want_pcm, pcm)) {
free_memtype(paddr, paddr + size);
- printk(KERN_ERR "%s:%d map pfn expected mapping type %s"
- " for [mem %#010Lx-%#010Lx], got %s\n",
- current->comm, current->pid,
- cattr_name(want_pcm),
- (unsigned long long)paddr,
- (unsigned long long)(paddr + size - 1),
- cattr_name(pcm));
+ pr_err("x86/PAT: %s:%d map pfn expected mapping type %s for [mem %#010Lx-%#010Lx], got %s\n",
+ current->comm, current->pid,
+ cattr_name(want_pcm),
+ (unsigned long long)paddr,
+ (unsigned long long)(paddr + size - 1),
+ cattr_name(pcm));
return -EINVAL;
}
/*
diff --git a/arch/x86/mm/pat_internal.h b/arch/x86/mm/pat_internal.h
index f6411620305d..a739bfc40690 100644
--- a/arch/x86/mm/pat_internal.h
+++ b/arch/x86/mm/pat_internal.h
@@ -4,7 +4,7 @@
extern int pat_debug_enable;

#define dprintk(fmt, arg...) \
- do { if (pat_debug_enable) printk(KERN_INFO fmt, ##arg); } while (0)
+ do { if (pat_debug_enable) pr_info("x86/PAT: " fmt, ##arg); } while (0)

struct memtype {
u64 start;
diff --git a/arch/x86/mm/pat_rbtree.c b/arch/x86/mm/pat_rbtree.c
index 6582adcc8bd9..82b8c6aaf260 100644
--- a/arch/x86/mm/pat_rbtree.c
+++ b/arch/x86/mm/pat_rbtree.c
@@ -160,7 +160,7 @@ success:
return 0;

failure:
- printk(KERN_INFO "%s:%d conflicting memory types "
+ pr_info("x86/PAT: %s:%d conflicting memory types "
"%Lx-%Lx %s<->%s\n", current->comm, current->pid, start,
end, cattr_name(found_type), cattr_name(match->type));
return -EBUSY;
--
2.3.5

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2015-05-04 15:22:45

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v5 2/6] x86/mm/pat: redefine pat_enabled

On Thu, Apr 30, 2015 at 01:25:16PM -0700, Luis R. Rodriguez wrote:
> diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
> index bfef424..f094d36 100644
> --- a/arch/x86/kernel/cpu/mtrr/main.c
> +++ b/arch/x86/kernel/cpu/mtrr/main.c
> @@ -556,7 +556,7 @@ int arch_phys_wc_add(unsigned long base, unsigned long size)
> {
> int ret;
>
> - if (pat_enabled || !mtrr_enabled)
> + if (pat_enabled() || !mtrr_enabled)

What's going on here? I got a reject about mtrr_enabled which is nowhere
to be found. Am I missing a patch?

Anyway, I applied that:

---
From: "Luis R. Rodriguez" <[email protected]>
Date: Thu, 30 Apr 2015 13:25:16 -0700
Subject: [PATCH] x86/mm/pat: Redefine pat_enabled

We use pat_enabled in x86 specific code to see if PAT is enabled or
not, we however are granting full access to the variable even though
readers do not need to set it. If for instance we granted access to it
to modules later they then could override the variable setting... no
bueno.

This renames pat_enabled to a new static variable __pat_enabled. To
see if PAT is enabled / disabled folks can just use the nice accessor
pat_enabled() now.

Code that sets this can only be internal to pat.c. Apart from the early
kernel parameter "nopat" to disable PAT we also have a few cases that
disable it later and make use of a helper pat_disable(), this helper is
wrapped under an ifdef but since that code cannot run unless PAT was
enabled its not required to wrap it with ifdefs, unwrap that. Likewise
since "nopat" doesn't really change non-PAT systems just remove that
ifdef as well.

Although we could add and use an early_param_off() these helpers don't
use __read_mostly and we want to keep __read_mostly for __pat_enabled as
this is a hot path -- upon boot for instance a simple guest may see ~4k
accesses to pat_enabled(). Since __read_mostly early boot params are not
that common we don't add a helper for them just yet.

Signed-off-by: Luis R. Rodriguez <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Kyle McMartin <[email protected]>
Cc: Andy Walls <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by:
---
arch/x86/include/asm/pat.h | 7 +------
arch/x86/kernel/cpu/mtrr/main.c | 2 +-
arch/x86/mm/iomap_32.c | 2 +-
arch/x86/mm/ioremap.c | 4 ++--
arch/x86/mm/pageattr.c | 2 +-
arch/x86/mm/pat.c | 33 +++++++++++++++------------------
arch/x86/pci/i386.c | 6 +++---
7 files changed, 24 insertions(+), 32 deletions(-)

diff --git a/arch/x86/include/asm/pat.h b/arch/x86/include/asm/pat.h
index 91bc4ba95f91..cdcff7f7f694 100644
--- a/arch/x86/include/asm/pat.h
+++ b/arch/x86/include/asm/pat.h
@@ -4,12 +4,7 @@
#include <linux/types.h>
#include <asm/pgtable_types.h>

-#ifdef CONFIG_X86_PAT
-extern int pat_enabled;
-#else
-static const int pat_enabled;
-#endif
-
+bool pat_enabled(void);
extern void pat_init(void);
void pat_init_cache_modes(void);

diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
index ea5f363a1948..96fa7b38af5e 100644
--- a/arch/x86/kernel/cpu/mtrr/main.c
+++ b/arch/x86/kernel/cpu/mtrr/main.c
@@ -545,7 +545,7 @@ int arch_phys_wc_add(unsigned long base, unsigned long size)
{
int ret;

- if (pat_enabled)
+ if (pat_enabled())
return 0; /* Success! (We don't need to do anything.) */

ret = mtrr_add(base, size, MTRR_TYPE_WRCOMB, true);
diff --git a/arch/x86/mm/iomap_32.c b/arch/x86/mm/iomap_32.c
index 9ca35fc60cfe..3a2ec8790ca7 100644
--- a/arch/x86/mm/iomap_32.c
+++ b/arch/x86/mm/iomap_32.c
@@ -82,7 +82,7 @@ iomap_atomic_prot_pfn(unsigned long pfn, pgprot_t prot)
* MTRR is UC or WC. UC_MINUS gets the real intention, of the
* user, which is "WC if the MTRR is WC, UC if you can't do that."
*/
- if (!pat_enabled && pgprot_val(prot) ==
+ if (!pat_enabled() && pgprot_val(prot) ==
(__PAGE_KERNEL | cachemode2protval(_PAGE_CACHE_MODE_WC)))
prot = __pgprot(__PAGE_KERNEL |
cachemode2protval(_PAGE_CACHE_MODE_UC_MINUS));
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index fc08431a387b..ea379c06cc4c 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -234,7 +234,7 @@ void __iomem *ioremap_nocache(resource_size_t phys_addr, unsigned long size)
{
/*
* Ideally, this should be:
- * pat_enabled ? _PAGE_CACHE_MODE_UC : _PAGE_CACHE_MODE_UC_MINUS;
+ * pat_enabled() ? _PAGE_CACHE_MODE_UC : _PAGE_CACHE_MODE_UC_MINUS;
*
* Till we fix all X drivers to use ioremap_wc(), we will use
* UC MINUS. Drivers that are certain they need or can already
@@ -292,7 +292,7 @@ EXPORT_SYMBOL_GPL(ioremap_uc);
*/
void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
{
- if (pat_enabled)
+ if (pat_enabled())
return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WC,
__builtin_return_address(0));
else
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index d35148acdc05..e07686633ce4 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -1573,7 +1573,7 @@ int set_memory_wc(unsigned long addr, int numpages)
{
int ret;

- if (!pat_enabled)
+ if (!pat_enabled())
return set_memory_uc(addr, numpages);

ret = reserve_memtype(__pa(addr), __pa(addr) + numpages * PAGE_SIZE,
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 8269c784b61c..6e05a071ffc8 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -33,12 +33,11 @@
#include "pat_internal.h"
#include "mm_internal.h"

-#ifdef CONFIG_X86_PAT
-int __read_mostly pat_enabled = 1;
+static int __read_mostly __pat_enabled = IS_ENABLED(CONFIG_X86_PAT);

static inline void pat_disable(const char *reason)
{
- pat_enabled = 0;
+ __pat_enabled = 0;
pr_info("x86/PAT: %s\n", reason);
}

@@ -48,13 +47,11 @@ static int __init nopat(char *str)
return 0;
}
early_param("nopat", nopat);
-#else
-static inline void pat_disable(const char *reason)
+
+bool pat_enabled(void)
{
- (void)reason;
+ return !!__pat_enabled;
}
-#endif
-

int pat_debug_enable;

@@ -198,7 +195,7 @@ void pat_init(void)
u64 pat;
bool boot_cpu = !boot_pat_state;

- if (!pat_enabled)
+ if (!pat_enabled())
return;

if (!cpu_has_pat) {
@@ -399,7 +396,7 @@ int reserve_memtype(u64 start, u64 end, enum page_cache_mode req_type,

BUG_ON(start >= end); /* end is exclusive */

- if (!pat_enabled) {
+ if (!pat_enabled()) {
/* This is identical to page table setting without PAT */
if (new_type) {
if (req_type == _PAGE_CACHE_MODE_WC)
@@ -474,7 +471,7 @@ int free_memtype(u64 start, u64 end)
int is_range_ram;
struct memtype *entry;

- if (!pat_enabled)
+ if (!pat_enabled())
return 0;

/* Low ISA region is always mapped WB. No need to track */
@@ -622,7 +619,7 @@ static inline int range_is_allowed(unsigned long pfn, unsigned long size)
u64 to = from + size;
u64 cursor = from;

- if (!pat_enabled)
+ if (!pat_enabled())
return 1;

while (cursor < to) {
@@ -658,7 +655,7 @@ int phys_mem_access_prot_allowed(struct file *file, unsigned long pfn,
* caching for the high addresses through the KEN pin, but
* we maintain the tradition of paranoia in this code.
*/
- if (!pat_enabled &&
+ if (!pat_enabled() &&
!(boot_cpu_has(X86_FEATURE_MTRR) ||
boot_cpu_has(X86_FEATURE_K6_MTRR) ||
boot_cpu_has(X86_FEATURE_CYRIX_ARR) ||
@@ -727,7 +724,7 @@ static int reserve_pfn_range(u64 paddr, unsigned long size, pgprot_t *vma_prot,
* the type requested matches the type of first page in the range.
*/
if (is_ram) {
- if (!pat_enabled)
+ if (!pat_enabled())
return 0;

pcm = lookup_memtype(paddr);
@@ -841,7 +838,7 @@ int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
return ret;
}

- if (!pat_enabled)
+ if (!pat_enabled())
return 0;

/*
@@ -869,7 +866,7 @@ int track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
{
enum page_cache_mode pcm;

- if (!pat_enabled)
+ if (!pat_enabled())
return 0;

/* Set prot based on lookup */
@@ -910,7 +907,7 @@ void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,

pgprot_t pgprot_writecombine(pgprot_t prot)
{
- if (pat_enabled)
+ if (pat_enabled())
return __pgprot(pgprot_val(prot) |
cachemode2protval(_PAGE_CACHE_MODE_WC));
else
@@ -993,7 +990,7 @@ static const struct file_operations memtype_fops = {

static int __init pat_memtype_list_init(void)
{
- if (pat_enabled) {
+ if (pat_enabled()) {
debugfs_create_file("pat_memtype_list", S_IRUSR,
arch_debugfs_dir, NULL, &memtype_fops);
}
diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
index 349c0d32cc0b..0a9f2caf358f 100644
--- a/arch/x86/pci/i386.c
+++ b/arch/x86/pci/i386.c
@@ -429,12 +429,12 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
* Caller can followup with UC MINUS request and add a WC mtrr if there
* is a free mtrr slot.
*/
- if (!pat_enabled && write_combine)
+ if (!pat_enabled() && write_combine)
return -EINVAL;

- if (pat_enabled && write_combine)
+ if (pat_enabled() && write_combine)
prot |= cachemode2protval(_PAGE_CACHE_MODE_WC);
- else if (pat_enabled || boot_cpu_data.x86 > 3)
+ else if (pat_enabled() || boot_cpu_data.x86 > 3)
/*
* ioremap() and ioremap_nocache() defaults to UC MINUS for now.
* To avoid attribute conflicts, request UC MINUS here
--
2.3.5

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2015-05-04 15:29:38

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v5 3/6] arch/x86/mm/pat: export pat_enabled()

On Thu, Apr 30, 2015 at 01:25:17PM -0700, Luis R. Rodriguez wrote:
> From: "Luis R. Rodriguez" <[email protected]>
>
> Two Linux device drivers cannot work with PAT and the work
> required to make them work is significant. There is not
> enough motivation to convert these drivers over to use
> PAT properly, the compromise reached is to let drivers
> that cannot be ported to PAT check if PAT was enabled
> and if so fail on probe with a recommendation to boot
> with the "nopat" kernel parameter.
>
> Cc: Andy Walls <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Andy Lutomirski <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Juergen Gross <[email protected]>
> Cc: Daniel Vetter <[email protected]>
> Cc: Dave Airlie <[email protected]>
> Cc: Bjorn Helgaas <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Michael S. Tsirkin <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Luis R. Rodriguez <[email protected]>
> ---
> arch/x86/mm/pat.c | 1 +
> 1 file changed, 1 insertion(+)

Applied, thanks.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2015-05-05 00:42:35

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v5 2/6] x86/mm/pat: redefine pat_enabled

On Mon, May 04, 2015 at 05:22:08PM +0200, Borislav Petkov wrote:
> On Thu, Apr 30, 2015 at 01:25:16PM -0700, Luis R. Rodriguez wrote:
> > diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
> > index bfef424..f094d36 100644
> > --- a/arch/x86/kernel/cpu/mtrr/main.c
> > +++ b/arch/x86/kernel/cpu/mtrr/main.c
> > @@ -556,7 +556,7 @@ int arch_phys_wc_add(unsigned long base, unsigned long size)
> > {
> > int ret;
> >
> > - if (pat_enabled || !mtrr_enabled)
> > + if (pat_enabled() || !mtrr_enabled)
>
> What's going on here? I got a reject about mtrr_enabled which is nowhere
> to be found. Am I missing a patch?

Yes, the patch titled, "x86: mtrr: generalize run time disabling of MTRR"
should be applied first, or ammended to fit the new style. Let me know what
you prefer.

> Anyway, I applied that:

Great, thanks, is there a tree I can use to rebase / fetch ?

Luis

Subject: RE: [PATCH v5 1/6] x86/mm/pat: use pr_info() and friends

> From: [email protected] [mailto:linux-kernel-
> [email protected]] On Behalf Of Luis R. Rodriguez
> Sent: Thursday, April 30, 2015 3:25 PM
> Subject: [PATCH v5 1/6] x86/mm/pat: use pr_info() and friends
>
...
> - printk(KERN_ERR "%s:%d map pfn expected mapping
> type %s"
> - " for [mem %#010Lx-%#010Lx], got %s\n",
> - current->comm, current->pid,
> - cattr_name(want_pcm),
> - (unsigned long long)paddr,
> - (unsigned long long)(paddr + size - 1),
> - cattr_name(pcm));
> + pr_err("%s:%d map pfn expected mapping type %s"
> + " for [mem %#010Lx-%#010Lx], got %s\n",

Since the patch joins some other print format strings split across
lines (which checkpatch allows), you might want to join this one too.

...
> diff --git a/arch/x86/mm/pat_rbtree.c b/arch/x86/mm/pat_rbtree.c
...
> failure:
> - printk(KERN_INFO "%s:%d conflicting memory types "
> + pr_info("%s:%d conflicting memory types "
> "%Lx-%Lx %s<->%s\n", current->comm, current->pid, start,
> end, cattr_name(found_type), cattr_name(match->type));

and that one.

2015-05-14 15:55:57

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v5 1/6] x86/mm/pat: use pr_info() and friends

On Thu, May 07, 2015 at 03:36:15AM +0000, Elliott, Robert (Server Storage) wrote:
> > From: [email protected] [mailto:linux-kernel-
> > [email protected]] On Behalf Of Luis R. Rodriguez
> > Sent: Thursday, April 30, 2015 3:25 PM
> > Subject: [PATCH v5 1/6] x86/mm/pat: use pr_info() and friends
> >
> ...
> > - printk(KERN_ERR "%s:%d map pfn expected mapping
> > type %s"
> > - " for [mem %#010Lx-%#010Lx], got %s\n",
> > - current->comm, current->pid,
> > - cattr_name(want_pcm),
> > - (unsigned long long)paddr,
> > - (unsigned long long)(paddr + size - 1),
> > - cattr_name(pcm));
> > + pr_err("%s:%d map pfn expected mapping type %s"
> > + " for [mem %#010Lx-%#010Lx], got %s\n",
>
> Since the patch joins some other print format strings split across
> lines (which checkpatch allows), you might want to join this one too.
>
> ...
> > diff --git a/arch/x86/mm/pat_rbtree.c b/arch/x86/mm/pat_rbtree.c
> ...
> > failure:
> > - printk(KERN_INFO "%s:%d conflicting memory types "
> > + pr_info("%s:%d conflicting memory types "
> > "%Lx-%Lx %s<->%s\n", current->comm, current->pid, start,
> > end, cattr_name(found_type), cattr_name(match->type));
>
> and that one.

I have adjusted this.

Boris, would you like a v6 re-spin on this series? Or just this patch, or
anthing else? FWIW since I keep having to re-do patches / rebase after a while
and the entire kill-mtrr series is large with tons of parts I've set out a tree
with all pending mtrr changes. The kill-mtrr-v5-20150514 can be used:

https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git

When needed I'll just fetch linux-next and rebase --onto that day's
origin/master.

Luis