2015-05-15 18:43:20

by Toshi Kani

[permalink] [raw]
Subject: [PATCH v5 0/6] mtrr, mm, x86: Enhance MTRR checks for huge I/O mapping

This patchset enhances MTRR checks for the kernel huge I/O mapping.

The following functional changes are made in patch 6/6.
- Allow pud_set_huge() and pmd_set_huge() to create a huge page mapping
when the range is covered by a single MTRR entry of any memory type.
- Log a pr_warn_once() message when a specified PMD map range spans more
than a single MTRR entry. Drivers should make a mapping request aligned
to a single MTRR entry when the range is covered by MTRRs.

Patch 1/6 simplifies the condition of HAVE_ARCH_HUGE_VMAP in Kconfig.
Patch 2/6 - 5/6 are bug fix and clean up to mtrr_type_lookup().

The patchset is based on the tip tree.
---
v5:
- Separate Kconfig change and reordered/squashed the patchset. (Borislav
Petkov)
- Update logs, comments and functional structures. (Borislav Petkov)
- Move MTRR_STATE_MTRR_XXX definitions to kernel asm/mtrr.h. (Borislav
Petkov)
- Change mtrr_type_lookup() not to set 'uniform' in case of MTRR_TYPE_INVALID.
(Borislav Petkov)
- Remove a patch accepted in the tip free from the series.

v4:
- Update the change logs of patchset. (Ingo Molnar)
- Add patch 3/7 to make the wrong address fix as a separate patch.
(Ingo Molnar)
- Add patch 5/7 to define MTRR_TYPE_INVALID. (Ingo Molnar)
- Update patch 6/7 to document MTRR fixed ranges. (Ingo Molnar)

v3:
- Add patch 3/5 to fix a bug in MTRR state checks.
- Update patch 4/5 to create separate functions for the fixed and
variable entries. (Ingo Molnar)

v2:
- Update change logs and comments per review comments.
(Ingo Molnar)
- Add patch 3/4 to clean up mtrr_type_lookup(). (Ingo Molnar)

---
Toshi Kani (6):
1/6 mm, x86: Simplify conditions of HAVE_ARCH_HUGE_VMAP
2/6 mtrr, x86: Fix MTRR lookup to handle inclusive entry
3/6 mtrr, x86: Fix MTRR state checks in mtrr_type_lookup()
4/6 mtrr, x86: Define MTRR_TYPE_INVALID for mtrr_type_lookup()
5/6 mtrr, x86: Clean up mtrr_type_lookup()
6/6 mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping

---
arch/x86/Kconfig | 2 +-
arch/x86/include/asm/mtrr.h | 10 +-
arch/x86/include/uapi/asm/mtrr.h | 8 +-
arch/x86/kernel/cpu/mtrr/cleanup.c | 3 +-
arch/x86/kernel/cpu/mtrr/generic.c | 200 ++++++++++++++++++++++++-------------
arch/x86/mm/pat.c | 4 +-
arch/x86/mm/pgtable.c | 59 ++++++++---
7 files changed, 194 insertions(+), 92 deletions(-)


2015-05-15 18:43:26

by Toshi Kani

[permalink] [raw]
Subject: [PATCH v5 1/6] mm, x86: Simplify conditions of HAVE_ARCH_HUGE_VMAP

Simplify the conditions to select HAVE_ARCH_HUGE_VMAP
in arch/x86/Kconfig since X86_PAE depends on X86_32.

Signed-off-by: Toshi Kani <[email protected]>
---
arch/x86/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 8fec044..73a4d03 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -100,7 +100,7 @@ config X86
select IRQ_FORCED_THREADING
select HAVE_BPF_JIT if X86_64
select HAVE_ARCH_TRANSPARENT_HUGEPAGE
- select HAVE_ARCH_HUGE_VMAP if X86_64 || (X86_32 && X86_PAE)
+ select HAVE_ARCH_HUGE_VMAP if X86_64 || X86_PAE
select ARCH_HAS_SG_CHAIN
select CLKEVT_I8253
select ARCH_HAVE_NMI_SAFE_CMPXCHG

2015-05-15 18:45:01

by Toshi Kani

[permalink] [raw]
Subject: [PATCH v5 2/6] mtrr, x86: Fix MTRR lookup to handle inclusive entry

When an MTRR entry is inclusive to a requested range, i.e.
the start and end of the request are not within the MTRR
entry range but the range contains the MTRR entry entirely,
__mtrr_type_lookup() ignores such a case because both
start_state and end_state are set to zero.

This bug can cause the following issues:
1) reserve_memtype() tracks an effective memory type in case
a request type is WB (ex. /dev/mem blindly uses WB). Missing
to track with its effective type causes a subsequent request
to map the same range with the effective type to fail.
2) pud_set_huge() and pmd_set_huge() check if a requested range
has any overlap with MTRRs. Missing to detect an overlap may
cause a performance penalty or undefined behavior.

This patch fixes the bug by adding a new flag, 'inclusive',
to detect the inclusive case. This case is then handled in
the same way as end_state:1 since the first region is the same.
With this fix, __mtrr_type_lookup() handles the inclusive case
properly.

Signed-off-by: Toshi Kani <[email protected]>
---
arch/x86/kernel/cpu/mtrr/generic.c | 28 ++++++++++++++++++----------
1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index 5b23967..e202d26 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -154,7 +154,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)

prev_match = 0xFF;
for (i = 0; i < num_var_ranges; ++i) {
- unsigned short start_state, end_state;
+ unsigned short start_state, end_state, inclusive;

if (!(mtrr_state.var_ranges[i].mask_lo & (1 << 11)))
continue;
@@ -166,19 +166,27 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)

start_state = ((start & mask) == (base & mask));
end_state = ((end & mask) == (base & mask));
+ inclusive = ((start < base) && (end > base));

- if (start_state != end_state) {
+ if ((start_state != end_state) || inclusive) {
/*
* We have start:end spanning across an MTRR.
- * We split the region into
- * either
- * (start:mtrr_end) (mtrr_end:end)
- * or
- * (start:mtrr_start) (mtrr_start:end)
+ * We split the region into either
+ *
+ * - start_state:1
+ * (start:mtrr_end)(mtrr_end:end)
+ * - end_state:1
+ * (start:mtrr_start)(mtrr_start:end)
+ * - inclusive:1
+ * (start:mtrr_start)(mtrr_start:mtrr_end)(mtrr_end:end)
+ *
* depending on kind of overlap.
- * Return the type for first region and a pointer to
- * the start of second region so that caller will
- * lookup again on the second region.
+ *
+ * Return the type of the first region and a pointer
+ * to the start of next region so that caller will be
+ * advised to lookup again after having adjusted start
+ * and end.
+ *
* Note: This way we handle multiple overlaps as well.
*/
if (start_state)

2015-05-15 18:43:33

by Toshi Kani

[permalink] [raw]
Subject: [PATCH v5 3/6] mtrr, x86: Fix MTRR state checks in mtrr_type_lookup()

'mtrr_state.enabled' contains the FE (fixed MTRRs enabled)
and E (MTRRs enabled) flags in MSR_MTRRdefType. Intel SDM,
section 11.11.2.1, defines these flags as follows:
- All MTRRs are disabled when the E flag is clear.
The FE flag has no affect when the E flag is clear.
- The default type is enabled when the E flag is set.
- MTRR variable ranges are enabled when the E flag is set.
- MTRR fixed ranges are enabled when both E and FE flags
are set.

MTRR state checks in __mtrr_type_lookup() do not match with
SDM. Hence, this patch makes the following changes:
- The current code detects MTRRs disabled when both E and
FE flags are clear in mtrr_state.enabled. Fix to detect
MTRRs disabled when the E flag is clear.
- The current code does not check if the FE bit is set in
mtrr_state.enabled when looking into the fixed entries.
Fix to check the FE flag.
- The current code returns the default type when the E flag
is clear in mtrr_state.enabled. However, the default type
is also disabled when the E flag is clear. Fix to remove
the code as this case is handled as MTRR disabled with
the 1st change.

In addition, this patch defines the E and FE flags in
mtrr_state.enabled as follows.
- FE flag: MTRR_STATE_MTRR_FIXED_ENABLED
- E flag: MTRR_STATE_MTRR_ENABLED

print_mtrr_state() and x86_get_mtrr_mem_range() are also updated
accordingly.

Signed-off-by: Toshi Kani <[email protected]>
---
arch/x86/include/asm/mtrr.h | 4 ++++
arch/x86/kernel/cpu/mtrr/cleanup.c | 3 ++-
arch/x86/kernel/cpu/mtrr/generic.c | 15 ++++++++-------
3 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index f768f62..ef92794 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -127,4 +127,8 @@ struct mtrr_gentry32 {
_IOW(MTRR_IOCTL_BASE, 9, struct mtrr_sentry32)
#endif /* CONFIG_COMPAT */

+/* Bit fields for enabled in struct mtrr_state_type */
+#define MTRR_STATE_MTRR_FIXED_ENABLED 0x01
+#define MTRR_STATE_MTRR_ENABLED 0x02
+
#endif /* _ASM_X86_MTRR_H */
diff --git a/arch/x86/kernel/cpu/mtrr/cleanup.c b/arch/x86/kernel/cpu/mtrr/cleanup.c
index 5f90b85..70d7c93 100644
--- a/arch/x86/kernel/cpu/mtrr/cleanup.c
+++ b/arch/x86/kernel/cpu/mtrr/cleanup.c
@@ -98,7 +98,8 @@ x86_get_mtrr_mem_range(struct range *range, int nr_range,
continue;
base = range_state[i].base_pfn;
if (base < (1<<(20-PAGE_SHIFT)) && mtrr_state.have_fixed &&
- (mtrr_state.enabled & 1)) {
+ (mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED) &&
+ (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) {
/* Var MTRR contains UC entry below 1M? Skip it: */
printk(BIOS_BUG_MSG, i);
if (base + size <= (1<<(20-PAGE_SHIFT)))
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index e202d26..b0599db 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -119,14 +119,16 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
if (!mtrr_state_set)
return 0xFF;

- if (!mtrr_state.enabled)
+ if (!(mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED))
return 0xFF;

/* Make end inclusive end, instead of exclusive */
end--;

/* Look in fixed ranges. Just return the type as per start */
- if (mtrr_state.have_fixed && (start < 0x100000)) {
+ if ((start < 0x100000) &&
+ (mtrr_state.have_fixed) &&
+ (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) {
int idx;

if (start < 0x80000) {
@@ -149,9 +151,6 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
* Look of multiple ranges matching this address and pick type
* as per MTRR precedence
*/
- if (!(mtrr_state.enabled & 2))
- return mtrr_state.def_type;
-
prev_match = 0xFF;
for (i = 0; i < num_var_ranges; ++i) {
unsigned short start_state, end_state, inclusive;
@@ -355,7 +354,9 @@ static void __init print_mtrr_state(void)
mtrr_attrib_to_str(mtrr_state.def_type));
if (mtrr_state.have_fixed) {
pr_debug("MTRR fixed ranges %sabled:\n",
- mtrr_state.enabled & 1 ? "en" : "dis");
+ ((mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED) &&
+ (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) ?
+ "en" : "dis");
print_fixed(0x00000, 0x10000, mtrr_state.fixed_ranges + 0);
for (i = 0; i < 2; ++i)
print_fixed(0x80000 + i * 0x20000, 0x04000,
@@ -368,7 +369,7 @@ static void __init print_mtrr_state(void)
print_fixed_last();
}
pr_debug("MTRR variable ranges %sabled:\n",
- mtrr_state.enabled & 2 ? "en" : "dis");
+ mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED ? "en" : "dis");
high_width = (__ffs64(size_or_mask) - (32 - PAGE_SHIFT) + 3) / 4;

for (i = 0; i < num_var_ranges; ++i) {

2015-05-15 18:43:38

by Toshi Kani

[permalink] [raw]
Subject: [PATCH v5 4/6] mtrr, x86: Define MTRR_TYPE_INVALID for mtrr_type_lookup()

mtrr_type_lookup() returns 0xFF when it cannot return a valid
MTRR memory type since MTRRs are disabled. This patch defines
MTRR_TYPE_INVALID to clarify the meaning of this value, and
documents its usage.

Document the return values of Kernel Virtual Address mapping
functions, pud_set_huge(), pmd_set_huge, pud_clear_huge() and
pmd_clear_huge().

There is no functional change in this patch.

Signed-off-by: Toshi Kani <[email protected]>
---
arch/x86/include/asm/mtrr.h | 2 +-
arch/x86/include/uapi/asm/mtrr.h | 8 ++++++-
arch/x86/kernel/cpu/mtrr/generic.c | 14 ++++++------
arch/x86/mm/pgtable.c | 42 +++++++++++++++++++++++++++---------
4 files changed, 47 insertions(+), 19 deletions(-)

diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index ef92794..bb03a54 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -55,7 +55,7 @@ static inline u8 mtrr_type_lookup(u64 addr, u64 end)
/*
* Return no-MTRRs:
*/
- return 0xff;
+ return MTRR_TYPE_INVALID;
}
#define mtrr_save_fixed_ranges(arg) do {} while (0)
#define mtrr_save_state() do {} while (0)
diff --git a/arch/x86/include/uapi/asm/mtrr.h b/arch/x86/include/uapi/asm/mtrr.h
index d0acb65..7528dcf 100644
--- a/arch/x86/include/uapi/asm/mtrr.h
+++ b/arch/x86/include/uapi/asm/mtrr.h
@@ -103,7 +103,7 @@ struct mtrr_state_type {
#define MTRRIOC_GET_PAGE_ENTRY _IOWR(MTRR_IOCTL_BASE, 8, struct mtrr_gentry)
#define MTRRIOC_KILL_PAGE_ENTRY _IOW(MTRR_IOCTL_BASE, 9, struct mtrr_sentry)

-/* These are the region types */
+/* MTRR memory types, which are defined in SDM */
#define MTRR_TYPE_UNCACHABLE 0
#define MTRR_TYPE_WRCOMB 1
/*#define MTRR_TYPE_ 2*/
@@ -113,5 +113,11 @@ struct mtrr_state_type {
#define MTRR_TYPE_WRBACK 6
#define MTRR_NUM_TYPES 7

+/*
+ * Invalid MTRR memory type. mtrr_type_lookup() returns this value when
+ * MTRRs are disabled. Note, this value is allocated from the reserved
+ * values (0x7-0xff) of the MTRR memory types.
+ */
+#define MTRR_TYPE_INVALID 0xff

#endif /* _UAPI_ASM_X86_MTRR_H */
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index b0599db..7b1491c 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -104,7 +104,7 @@ static int check_type_overlap(u8 *prev, u8 *curr)

/*
* Error/Semi-error returns:
- * 0xFF - when MTRR is not enabled
+ * MTRR_TYPE_INVALID - when MTRR is not enabled
* *repeat == 1 implies [start:end] spanned across MTRR range and type returned
* corresponds only to [start:*partial_end].
* Caller has to lookup again for [*partial_end:end].
@@ -117,10 +117,10 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)

*repeat = 0;
if (!mtrr_state_set)
- return 0xFF;
+ return MTRR_TYPE_INVALID;

if (!(mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED))
- return 0xFF;
+ return MTRR_TYPE_INVALID;

/* Make end inclusive end, instead of exclusive */
end--;
@@ -151,7 +151,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
* Look of multiple ranges matching this address and pick type
* as per MTRR precedence
*/
- prev_match = 0xFF;
+ prev_match = MTRR_TYPE_INVALID;
for (i = 0; i < num_var_ranges; ++i) {
unsigned short start_state, end_state, inclusive;

@@ -206,7 +206,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
continue;

curr_match = mtrr_state.var_ranges[i].base_lo & 0xff;
- if (prev_match == 0xFF) {
+ if (prev_match == MTRR_TYPE_INVALID) {
prev_match = curr_match;
continue;
}
@@ -220,7 +220,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
return MTRR_TYPE_WRBACK;
}

- if (prev_match != 0xFF)
+ if (prev_match != MTRR_TYPE_INVALID)
return prev_match;

return mtrr_state.def_type;
@@ -229,7 +229,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
/*
* Returns the effective MTRR type for the region
* Error return:
- * 0xFF - when MTRR is not enabled
+ * MTRR_TYPE_INVALID - when MTRR is not enabled
*/
u8 mtrr_type_lookup(u64 start, u64 end)
{
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 0b97d2c..c30f981 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -563,16 +563,22 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
}

#ifdef CONFIG_HAVE_ARCH_HUGE_VMAP
+/**
+ * pud_set_huge - setup kernel PUD mapping
+ *
+ * MTRR can override PAT memory types with 4KiB granularity. Therefore,
+ * this function does not set up a huge page when the range is covered
+ * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are
+ * disabled.
+ *
+ * Returns 1 on success and 0 on failure.
+ */
int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
{
u8 mtrr;

- /*
- * Do not use a huge page when the range is covered by non-WB type
- * of MTRRs.
- */
mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE);
- if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF))
+ if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
return 0;

prot = pgprot_4k_2_large(prot);
@@ -584,16 +590,22 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
return 1;
}

+/**
+ * pmd_set_huge - setup kernel PMD mapping
+ *
+ * MTRR can override PAT memory types with 4KiB granularity. Therefore,
+ * this function does not set up a huge page when the range is covered
+ * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are
+ * disabled.
+ *
+ * Returns 1 on success and 0 on failure.
+ */
int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
{
u8 mtrr;

- /*
- * Do not use a huge page when the range is covered by non-WB type
- * of MTRRs.
- */
mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE);
- if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF))
+ if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
return 0;

prot = pgprot_4k_2_large(prot);
@@ -605,6 +617,11 @@ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
return 1;
}

+/**
+ * pud_clear_huge - clear kernel PUD mapping when it is set
+ *
+ * Returns 1 on success and 0 on failure (no PUD map is found).
+ */
int pud_clear_huge(pud_t *pud)
{
if (pud_large(*pud)) {
@@ -615,6 +632,11 @@ int pud_clear_huge(pud_t *pud)
return 0;
}

+/**
+ * pmd_clear_huge - clear kernel PMD mapping when it is set
+ *
+ * Returns 1 on success and 0 on failure (no PMD map is found).
+ */
int pmd_clear_huge(pmd_t *pmd)
{
if (pmd_large(*pmd)) {

2015-05-15 18:44:01

by Toshi Kani

[permalink] [raw]
Subject: [PATCH v5 5/6] mtrr, x86: Clean up mtrr_type_lookup()

MTRRs contain fixed and variable entries. mtrr_type_lookup()
may repeatedly call __mtrr_type_lookup() to handle a request
that overlaps with variable entries. However,
__mtrr_type_lookup() also handles the fixed entries, which
do not have to be repeated. Therefore, this patch creates
separate functions, mtrr_type_lookup_fixed() and
mtrr_type_lookup_variable(), to handle the fixed and variable
ranges respectively.

The patch also updates the function headers to clarify the
return values and output argument. It updates comments to
clarify that the repeating is necessary to handle overlaps
with the default type, since overlaps with multiple entries
alone can be handled without such repeating.

There is no functional change in this patch.

Signed-off-by: Toshi Kani <[email protected]>
---
arch/x86/kernel/cpu/mtrr/generic.c | 136 +++++++++++++++++++++++-------------
1 file changed, 85 insertions(+), 51 deletions(-)

diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index 7b1491c..c7d5245 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -102,55 +102,67 @@ static int check_type_overlap(u8 *prev, u8 *curr)
return 0;
}

-/*
- * Error/Semi-error returns:
- * MTRR_TYPE_INVALID - when MTRR is not enabled
- * *repeat == 1 implies [start:end] spanned across MTRR range and type returned
- * corresponds only to [start:*partial_end].
- * Caller has to lookup again for [*partial_end:end].
+/**
+ * mtrr_type_lookup_fixed - look up memory type in MTRR fixed entries
+ *
+ * Return the MTRR fixed memory type of 'start'.
+ *
+ * MTRR fixed entries are divided into the following ways:
+ * 0x00000 - 0x7FFFF : This range is divided into eight 64KB sub-ranges
+ * 0x80000 - 0xBFFFF : This range is divided into sixteen 16KB sub-ranges
+ * 0xC0000 - 0xFFFFF : This range is divided into sixty-four 4KB sub-ranges
+ *
+ * Return Values:
+ * MTRR_TYPE_(type) - Matched memory type
+ * MTRR_TYPE_INVALID - Unmatched
*/
-static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
+static u8 mtrr_type_lookup_fixed(u64 start, u64 end)
+{
+ int idx;
+
+ if (start >= 0x100000)
+ return MTRR_TYPE_INVALID;
+
+ if (start < 0x80000) { /* 0x0 - 0x7FFFF */
+ idx = 0;
+ idx += (start >> 16);
+ return mtrr_state.fixed_ranges[idx];
+
+ } else if (start < 0xC0000) { /* 0x80000 - 0xBFFFF */
+ idx = 1 * 8;
+ idx += ((start - 0x80000) >> 14);
+ return mtrr_state.fixed_ranges[idx];
+ }
+
+ /* 0xC0000 - 0xFFFFF */
+ idx = 3 * 8;
+ idx += ((start - 0xC0000) >> 12);
+ return mtrr_state.fixed_ranges[idx];
+}
+
+/**
+ * mtrr_type_lookup_variable - look up memory type in MTRR variable entries
+ *
+ * Return Value:
+ * MTRR_TYPE_(type) - Matched memory type or default memory type (unmatched)
+ *
+ * Output Argument:
+ * repeat - Set to 1 when [start:end] spanned across MTRR range and type
+ * returned corresponds only to [start:*partial_end]. Caller has
+ * to lookup again for [*partial_end:end].
+ */
+static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
+ int *repeat)
{
int i;
u64 base, mask;
u8 prev_match, curr_match;

*repeat = 0;
- if (!mtrr_state_set)
- return MTRR_TYPE_INVALID;
-
- if (!(mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED))
- return MTRR_TYPE_INVALID;

/* Make end inclusive end, instead of exclusive */
end--;

- /* Look in fixed ranges. Just return the type as per start */
- if ((start < 0x100000) &&
- (mtrr_state.have_fixed) &&
- (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) {
- int idx;
-
- if (start < 0x80000) {
- idx = 0;
- idx += (start >> 16);
- return mtrr_state.fixed_ranges[idx];
- } else if (start < 0xC0000) {
- idx = 1 * 8;
- idx += ((start - 0x80000) >> 14);
- return mtrr_state.fixed_ranges[idx];
- } else {
- idx = 3 * 8;
- idx += ((start - 0xC0000) >> 12);
- return mtrr_state.fixed_ranges[idx];
- }
- }
-
- /*
- * Look in variable ranges
- * Look of multiple ranges matching this address and pick type
- * as per MTRR precedence
- */
prev_match = MTRR_TYPE_INVALID;
for (i = 0; i < num_var_ranges; ++i) {
unsigned short start_state, end_state, inclusive;
@@ -186,7 +198,8 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
* advised to lookup again after having adjusted start
* and end.
*
- * Note: This way we handle multiple overlaps as well.
+ * Note: This way we handle overlaps with multiple
+ * entries and the default type properly.
*/
if (start_state)
*partial_end = base + get_mtrr_size(mask);
@@ -215,21 +228,18 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
return curr_match;
}

- if (mtrr_tom2) {
- if (start >= (1ULL<<32) && (end < mtrr_tom2))
- return MTRR_TYPE_WRBACK;
- }
-
if (prev_match != MTRR_TYPE_INVALID)
return prev_match;

return mtrr_state.def_type;
}

-/*
- * Returns the effective MTRR type for the region
- * Error return:
- * MTRR_TYPE_INVALID - when MTRR is not enabled
+/**
+ * mtrr_type_lookup - look up memory type in MTRR
+ *
+ * Return Values:
+ * MTRR_TYPE_(type) - The effective MTRR type for the region
+ * MTRR_TYPE_INVALID - MTRR is disabled
*/
u8 mtrr_type_lookup(u64 start, u64 end)
{
@@ -237,22 +247,46 @@ u8 mtrr_type_lookup(u64 start, u64 end)
int repeat;
u64 partial_end;

- type = __mtrr_type_lookup(start, end, &partial_end, &repeat);
+ if (!mtrr_state_set)
+ return MTRR_TYPE_INVALID;
+
+ if (!(mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED))
+ return MTRR_TYPE_INVALID;
+
+ /*
+ * Look up the fixed ranges first, which take priority over
+ * the variable ranges.
+ */
+ if ((start < 0x100000) &&
+ (mtrr_state.have_fixed) &&
+ (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED))
+ return mtrr_type_lookup_fixed(start, end);
+
+ /*
+ * Look up the variable ranges. Look of multiple ranges matching
+ * this address and pick type as per MTRR precedence.
+ */
+ type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat);

/*
* Common path is with repeat = 0.
* However, we can have cases where [start:end] spans across some
- * MTRR range. Do repeated lookups for that case here.
+ * MTRR ranges and/or the default type. Do repeated lookups for
+ * that case here.
*/
while (repeat) {
prev_type = type;
start = partial_end;
- type = __mtrr_type_lookup(start, end, &partial_end, &repeat);
+ type = mtrr_type_lookup_variable(start, end, &partial_end,
+ &repeat);

if (check_type_overlap(&prev_type, &type))
return type;
}

+ if (mtrr_tom2 && (start >= (1ULL<<32)) && (end < mtrr_tom2))
+ return MTRR_TYPE_WRBACK;
+
return type;
}

2015-05-15 18:43:43

by Toshi Kani

[permalink] [raw]
Subject: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping

This patch adds an additional argument, 'uniform', to
mtrr_type_lookup(), which returns 1 when a given range is
covered uniformly by MTRRs, i.e. the range is fully covered
by a single MTRR entry or the default type.

pud_set_huge() and pmd_set_huge() are changed to check the
new 'uniform' flag to see if it is safe to create a huge page
mapping to the range. This allows them to create a huge page
mapping to a range covered by a single MTRR entry of any
memory type. It also detects a non-optimal request properly.
They continue to check with the WB type since the WB type has
no effect even if a request spans multiple MTRR entries.

pmd_set_huge() logs a warning message to a non-optimal request
so that driver writers will be aware of such a case. Drivers
should make a mapping request aligned to a single MTRR entry
when the range is covered by MTRRs.

Signed-off-by: Toshi Kani <[email protected]>
---
arch/x86/include/asm/mtrr.h | 4 ++--
arch/x86/kernel/cpu/mtrr/generic.c | 37 ++++++++++++++++++++++++++----------
arch/x86/mm/pat.c | 4 ++--
arch/x86/mm/pgtable.c | 33 ++++++++++++++++++++------------
4 files changed, 52 insertions(+), 26 deletions(-)

diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index bb03a54..a31759e 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -31,7 +31,7 @@
* arch_phys_wc_add and arch_phys_wc_del.
*/
# ifdef CONFIG_MTRR
-extern u8 mtrr_type_lookup(u64 addr, u64 end);
+extern u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform);
extern void mtrr_save_fixed_ranges(void *);
extern void mtrr_save_state(void);
extern int mtrr_add(unsigned long base, unsigned long size,
@@ -50,7 +50,7 @@ extern int mtrr_trim_uncached_memory(unsigned long end_pfn);
extern int amd_special_default_mtrr(void);
extern int phys_wc_to_mtrr_index(int handle);
# else
-static inline u8 mtrr_type_lookup(u64 addr, u64 end)
+static inline u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform)
{
/*
* Return no-MTRRs:
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index c7d5245..7d347ac 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -146,19 +146,22 @@ static u8 mtrr_type_lookup_fixed(u64 start, u64 end)
* Return Value:
* MTRR_TYPE_(type) - Matched memory type or default memory type (unmatched)
*
- * Output Argument:
+ * Output Arguments:
* repeat - Set to 1 when [start:end] spanned across MTRR range and type
* returned corresponds only to [start:*partial_end]. Caller has
* to lookup again for [*partial_end:end].
+ * uniform - Set to 1 when MTRR covers the region uniformly, i.e. the region
+ * is fully covered by a single MTRR entry or the default type.
*/
static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
- int *repeat)
+ int *repeat, u8 *uniform)
{
int i;
u64 base, mask;
u8 prev_match, curr_match;

*repeat = 0;
+ *uniform = 1;

/* Make end inclusive end, instead of exclusive */
end--;
@@ -213,6 +216,7 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,

end = *partial_end - 1; /* end is inclusive */
*repeat = 1;
+ *uniform = 0;
}

if ((start & mask) != (base & mask))
@@ -224,6 +228,7 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
continue;
}

+ *uniform = 0;
if (check_type_overlap(&prev_match, &curr_match))
return curr_match;
}
@@ -240,10 +245,14 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
* Return Values:
* MTRR_TYPE_(type) - The effective MTRR type for the region
* MTRR_TYPE_INVALID - MTRR is disabled
+ *
+ * Output Argument:
+ * uniform - Set to 1 when MTRR covers the region uniformly, i.e. the region
+ * is fully covered by a single MTRR entry or the default type.
*/
-u8 mtrr_type_lookup(u64 start, u64 end)
+u8 mtrr_type_lookup(u64 start, u64 end, u8 *uniform)
{
- u8 type, prev_type;
+ u8 type, prev_type, is_uniform = 1, dummy;
int repeat;
u64 partial_end;

@@ -259,14 +268,18 @@ u8 mtrr_type_lookup(u64 start, u64 end)
*/
if ((start < 0x100000) &&
(mtrr_state.have_fixed) &&
- (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED))
- return mtrr_type_lookup_fixed(start, end);
+ (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) {
+ is_uniform = 0;
+ type = mtrr_type_lookup_fixed(start, end);
+ goto out;
+ }

/*
* Look up the variable ranges. Look of multiple ranges matching
* this address and pick type as per MTRR precedence.
*/
- type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat);
+ type = mtrr_type_lookup_variable(start, end, &partial_end,
+ &repeat, &is_uniform);

/*
* Common path is with repeat = 0.
@@ -277,16 +290,20 @@ u8 mtrr_type_lookup(u64 start, u64 end)
while (repeat) {
prev_type = type;
start = partial_end;
+ is_uniform = 0;
+
type = mtrr_type_lookup_variable(start, end, &partial_end,
- &repeat);
+ &repeat, &dummy);

if (check_type_overlap(&prev_type, &type))
- return type;
+ goto out;
}

if (mtrr_tom2 && (start >= (1ULL<<32)) && (end < mtrr_tom2))
- return MTRR_TYPE_WRBACK;
+ type = MTRR_TYPE_WRBACK;

+out:
+ *uniform = is_uniform;
return type;
}

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 35af677..372ad42 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -267,9 +267,9 @@ static unsigned long pat_x_mtrr_type(u64 start, u64 end,
* request is for WB.
*/
if (req_type == _PAGE_CACHE_MODE_WB) {
- u8 mtrr_type;
+ u8 mtrr_type, uniform;

- mtrr_type = mtrr_type_lookup(start, end);
+ mtrr_type = mtrr_type_lookup(start, end, &uniform);
if (mtrr_type != MTRR_TYPE_WRBACK)
return _PAGE_CACHE_MODE_UC_MINUS;

diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index c30f981..3fa0eb9 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -567,18 +567,21 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
* pud_set_huge - setup kernel PUD mapping
*
* MTRR can override PAT memory types with 4KiB granularity. Therefore,
- * this function does not set up a huge page when the range is covered
- * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are
- * disabled.
+ * this function only sets up a huge page in the following conditions.
+ * - MTRR is disabled.
+ * - The range is mapped uniformly by MTRR, i.e. the range is fully covered
+ * by a single MTRR entry or the default type.
+ * - The MTRR memory type is WB.
*
* Returns 1 on success and 0 on failure.
*/
int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
{
- u8 mtrr;
+ u8 mtrr, uniform;

- mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE);
- if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
+ mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE, &uniform);
+ if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) &&
+ (mtrr != MTRR_TYPE_WRBACK))
return 0;

prot = pgprot_4k_2_large(prot);
@@ -594,19 +597,25 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
* pmd_set_huge - setup kernel PMD mapping
*
* MTRR can override PAT memory types with 4KiB granularity. Therefore,
- * this function does not set up a huge page when the range is covered
- * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are
- * disabled.
+ * this function only sets up a huge page in the following conditions.
+ * - MTRR is disabled.
+ * - The range is mapped uniformly by MTRR, i.e. the range is fully covered
+ * by a single MTRR entry or the default type.
+ * - The MTRR memory type is WB.
*
* Returns 1 on success and 0 on failure.
*/
int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
{
- u8 mtrr;
+ u8 mtrr, uniform;

- mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE);
- if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
+ mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE, &uniform);
+ if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) &&
+ (mtrr != MTRR_TYPE_WRBACK)) {
+ pr_warn_once("pmd_set_huge: requesting [mem %#010llx-%#010llx], which spans more than a single MTRR entry\n",
+ addr, addr + PMD_SIZE);
return 0;
+ }

prot = pgprot_4k_2_large(prot);

2015-05-17 08:30:48

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v5 1/6] mm, x86: Simplify conditions of HAVE_ARCH_HUGE_VMAP

On Fri, May 15, 2015 at 12:23:52PM -0600, Toshi Kani wrote:
> Simplify the conditions to select HAVE_ARCH_HUGE_VMAP
> in arch/x86/Kconfig since X86_PAE depends on X86_32.
>
> Signed-off-by: Toshi Kani <[email protected]>
> ---
> arch/x86/Kconfig | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 8fec044..73a4d03 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -100,7 +100,7 @@ config X86
> select IRQ_FORCED_THREADING
> select HAVE_BPF_JIT if X86_64
> select HAVE_ARCH_TRANSPARENT_HUGEPAGE
> - select HAVE_ARCH_HUGE_VMAP if X86_64 || (X86_32 && X86_PAE)
> + select HAVE_ARCH_HUGE_VMAP if X86_64 || X86_PAE
> select ARCH_HAS_SG_CHAIN
> select CLKEVT_I8253
> select ARCH_HAVE_NMI_SAFE_CMPXCHG

Applied, thanks.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2015-05-18 13:34:14

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping

On Fri, May 15, 2015 at 12:23:57PM -0600, Toshi Kani wrote:
> This patch adds an additional argument, 'uniform', to
> mtrr_type_lookup(), which returns 1 when a given range is
> covered uniformly by MTRRs, i.e. the range is fully covered
> by a single MTRR entry or the default type.
>
> pud_set_huge() and pmd_set_huge() are changed to check the
> new 'uniform' flag to see if it is safe to create a huge page
> mapping to the range. This allows them to create a huge page
> mapping to a range covered by a single MTRR entry of any
> memory type. It also detects a non-optimal request properly.
> They continue to check with the WB type since the WB type has
> no effect even if a request spans multiple MTRR entries.
>
> pmd_set_huge() logs a warning message to a non-optimal request
> so that driver writers will be aware of such a case. Drivers
> should make a mapping request aligned to a single MTRR entry
> when the range is covered by MTRRs.
>
> Signed-off-by: Toshi Kani <[email protected]>
> ---
> arch/x86/include/asm/mtrr.h | 4 ++--
> arch/x86/kernel/cpu/mtrr/generic.c | 37 ++++++++++++++++++++++++++----------
> arch/x86/mm/pat.c | 4 ++--
> arch/x86/mm/pgtable.c | 33 ++++++++++++++++++++------------
> 4 files changed, 52 insertions(+), 26 deletions(-)

...

> int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
> {
> - u8 mtrr;
> + u8 mtrr, uniform;
>
> - mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE);
> - if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
> + mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE, &uniform);
> + if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) &&
> + (mtrr != MTRR_TYPE_WRBACK)) {
> + pr_warn_once("pmd_set_huge: requesting [mem %#010llx-%#010llx], which spans more than a single MTRR entry\n",
> + addr, addr + PMD_SIZE);
> return 0;
> + }

All applied, I reformatted the comments in this last one a bit and made
the warning message hopefully a bit more descriptive:

---
From: Toshi Kani <[email protected]>
Date: Fri, 15 May 2015 12:23:57 -0600
Subject: [PATCH] x86/mm: Enhance MTRR checks in kernel mapping helpers

This patch adds the argument 'uniform' to mtrr_type_lookup(), which gets
set to 1 when a given range is covered uniformly by MTRRs, i.e. the
range is fully covered by a single MTRR entry or the default type.

Change pud_set_huge() and pmd_set_huge() to honor the 'uniform' flag to
see if it is safe to create a huge page mapping in the range.

This allows them to create a huge page mapping in a range covered by
a single MTRR entry of any memory type. It also detects a non-optimal
request properly. They continue to check with the WB type since it does
not effectively change the uniform mapping even if a request spans
multiple MTRR entries.

pmd_set_huge() logs a warning message to a non-optimal request so that
driver writers will be aware of such a case. Drivers should make a
mapping request aligned to a single MTRR entry when the range is covered
by MTRRs.

Signed-off-by: Toshi Kani <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Luis R. Rodriguez <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: linux-mm <[email protected]>
Cc: x86-ml <[email protected]>
Cc: lkml <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Realign comments, improve warning message. ]
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/include/asm/mtrr.h | 4 ++--
arch/x86/kernel/cpu/mtrr/generic.c | 40 +++++++++++++++++++++++++++----------
arch/x86/mm/pat.c | 4 ++--
arch/x86/mm/pgtable.c | 41 +++++++++++++++++++++++++-------------
4 files changed, 61 insertions(+), 28 deletions(-)

diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index bb03a547c1ab..a31759e1edd9 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -31,7 +31,7 @@
* arch_phys_wc_add and arch_phys_wc_del.
*/
# ifdef CONFIG_MTRR
-extern u8 mtrr_type_lookup(u64 addr, u64 end);
+extern u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform);
extern void mtrr_save_fixed_ranges(void *);
extern void mtrr_save_state(void);
extern int mtrr_add(unsigned long base, unsigned long size,
@@ -50,7 +50,7 @@ extern int mtrr_trim_uncached_memory(unsigned long end_pfn);
extern int amd_special_default_mtrr(void);
extern int phys_wc_to_mtrr_index(int handle);
# else
-static inline u8 mtrr_type_lookup(u64 addr, u64 end)
+static inline u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform)
{
/*
* Return no-MTRRs:
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index e51100c49eea..f782d9b62cb3 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -147,19 +147,24 @@ static u8 mtrr_type_lookup_fixed(u64 start, u64 end)
* Return Value:
* MTRR_TYPE_(type) - Matched memory type or default memory type (unmatched)
*
- * Output Argument:
+ * Output Arguments:
* repeat - Set to 1 when [start:end] spanned across MTRR range and type
* returned corresponds only to [start:*partial_end]. Caller has
* to lookup again for [*partial_end:end].
+ *
+ * uniform - Set to 1 when an MTRR covers the region uniformly, i.e. the
+ * region is fully covered by a single MTRR entry or the default
+ * type.
*/
static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
- int *repeat)
+ int *repeat, u8 *uniform)
{
int i;
u64 base, mask;
u8 prev_match, curr_match;

*repeat = 0;
+ *uniform = 1;

/* Make end inclusive instead of exclusive */
end--;
@@ -214,6 +219,7 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,

end = *partial_end - 1; /* end is inclusive */
*repeat = 1;
+ *uniform = 0;
}

if ((start & mask) != (base & mask))
@@ -225,6 +231,7 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
continue;
}

+ *uniform = 0;
if (check_type_overlap(&prev_match, &curr_match))
return curr_match;
}
@@ -241,10 +248,15 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
* Return Values:
* MTRR_TYPE_(type) - The effective MTRR type for the region
* MTRR_TYPE_INVALID - MTRR is disabled
+ *
+ * Output Argument:
+ * uniform - Set to 1 when an MTRR covers the region uniformly, i.e. the
+ * region is fully covered by a single MTRR entry or the default
+ * type.
*/
-u8 mtrr_type_lookup(u64 start, u64 end)
+u8 mtrr_type_lookup(u64 start, u64 end, u8 *uniform)
{
- u8 type, prev_type;
+ u8 type, prev_type, is_uniform = 1, dummy;
int repeat;
u64 partial_end;

@@ -260,14 +272,18 @@ u8 mtrr_type_lookup(u64 start, u64 end)
*/
if ((start < 0x100000) &&
(mtrr_state.have_fixed) &&
- (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED))
- return mtrr_type_lookup_fixed(start, end);
+ (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) {
+ is_uniform = 0;
+ type = mtrr_type_lookup_fixed(start, end);
+ goto out;
+ }

/*
* Look up the variable ranges. Look of multiple ranges matching
* this address and pick type as per MTRR precedence.
*/
- type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat);
+ type = mtrr_type_lookup_variable(start, end, &partial_end,
+ &repeat, &is_uniform);

/*
* Common path is with repeat = 0.
@@ -278,15 +294,19 @@ u8 mtrr_type_lookup(u64 start, u64 end)
while (repeat) {
prev_type = type;
start = partial_end;
- type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat);
+ is_uniform = 0;
+ type = mtrr_type_lookup_variable(start, end, &partial_end,
+ &repeat, &dummy);

if (check_type_overlap(&prev_type, &type))
- return type;
+ goto out;
}

if (mtrr_tom2 && (start >= (1ULL<<32)) && (end < mtrr_tom2))
- return MTRR_TYPE_WRBACK;
+ type = MTRR_TYPE_WRBACK;

+out:
+ *uniform = is_uniform;
return type;
}

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 35af6771a95a..372ad422c2c3 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -267,9 +267,9 @@ static unsigned long pat_x_mtrr_type(u64 start, u64 end,
* request is for WB.
*/
if (req_type == _PAGE_CACHE_MODE_WB) {
- u8 mtrr_type;
+ u8 mtrr_type, uniform;

- mtrr_type = mtrr_type_lookup(start, end);
+ mtrr_type = mtrr_type_lookup(start, end, &uniform);
if (mtrr_type != MTRR_TYPE_WRBACK)
return _PAGE_CACHE_MODE_UC_MINUS;

diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index c30f9819786b..f1894daa79ee 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -566,19 +566,24 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
/**
* pud_set_huge - setup kernel PUD mapping
*
- * MTRR can override PAT memory types with 4KiB granularity. Therefore,
- * this function does not set up a huge page when the range is covered
- * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are
- * disabled.
+ * MTRRs can override PAT memory types with 4KiB granularity. Therefore,
+ * this function sets up a huge page only if all of the following
+ * conditions are met:
+ *
+ * - MTRRs are disabled.
+ * - The range is mapped uniformly by an MTRR, i.e. the range is
+ * fully covered by a single MTRR entry or the default type.
+ * - The MTRR memory type is WB.
*
* Returns 1 on success and 0 on failure.
*/
int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
{
- u8 mtrr;
+ u8 mtrr, uniform;

- mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE);
- if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
+ mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE, &uniform);
+ if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) &&
+ (mtrr != MTRR_TYPE_WRBACK))
return 0;

prot = pgprot_4k_2_large(prot);
@@ -593,20 +598,28 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
/**
* pmd_set_huge - setup kernel PMD mapping
*
- * MTRR can override PAT memory types with 4KiB granularity. Therefore,
- * this function does not set up a huge page when the range is covered
- * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are
- * disabled.
+ * MTRRs can override PAT memory types with 4KiB granularity. Therefore,
+ * this function sets up a huge page only if all of the following
+ * conditions are met:
+ *
+ * - MTRR is disabled.
+ * - The range is mapped uniformly by an MTRR, i.e. the range is
+ * fully covered by a single MTRR entry or the default type.
+ * - The MTRR memory type is WB.
*
* Returns 1 on success and 0 on failure.
*/
int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
{
- u8 mtrr;
+ u8 mtrr, uniform;

- mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE);
- if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
+ mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE, &uniform);
+ if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) &&
+ (mtrr != MTRR_TYPE_WRBACK)) {
+ pr_warn_once("%s: Cannot satisfy [mem %#010llx-%#010llx] with a huge-page mapping due to MTRR override.\n",
+ __func__, addr, addr + PMD_SIZE);
return 0;
+ }

prot = pgprot_4k_2_large(prot);

--
2.3.5

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2015-05-18 17:42:05

by Toshi Kani

[permalink] [raw]
Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping

On Mon, 2015-05-18 at 15:33 +0200, Borislav Petkov wrote:
> On Fri, May 15, 2015 at 12:23:57PM -0600, Toshi Kani wrote:
> > This patch adds an additional argument, 'uniform', to
> > mtrr_type_lookup(), which returns 1 when a given range is
> > covered uniformly by MTRRs, i.e. the range is fully covered
> > by a single MTRR entry or the default type.
> >
> > pud_set_huge() and pmd_set_huge() are changed to check the
> > new 'uniform' flag to see if it is safe to create a huge page
> > mapping to the range. This allows them to create a huge page
> > mapping to a range covered by a single MTRR entry of any
> > memory type. It also detects a non-optimal request properly.
> > They continue to check with the WB type since the WB type has
> > no effect even if a request spans multiple MTRR entries.
> >
> > pmd_set_huge() logs a warning message to a non-optimal request
> > so that driver writers will be aware of such a case. Drivers
> > should make a mapping request aligned to a single MTRR entry
> > when the range is covered by MTRRs.
> >
> > Signed-off-by: Toshi Kani <[email protected]>
> > ---
> > arch/x86/include/asm/mtrr.h | 4 ++--
> > arch/x86/kernel/cpu/mtrr/generic.c | 37 ++++++++++++++++++++++++++----------
> > arch/x86/mm/pat.c | 4 ++--
> > arch/x86/mm/pgtable.c | 33 ++++++++++++++++++++------------
> > 4 files changed, 52 insertions(+), 26 deletions(-)
:
>
> All applied,

Great!

> I reformatted the comments in this last one a bit and made
> the warning message hopefully a bit more descriptive:

I have a few comments below.

> diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
> index c30f9819786b..f1894daa79ee 100644
> --- a/arch/x86/mm/pgtable.c
> +++ b/arch/x86/mm/pgtable.c
> @@ -566,19 +566,24 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
> /**
> * pud_set_huge - setup kernel PUD mapping
> *
> - * MTRR can override PAT memory types with 4KiB granularity. Therefore,
> - * this function does not set up a huge page when the range is covered
> - * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are
> - * disabled.
> + * MTRRs can override PAT memory types with 4KiB granularity. Therefore,
> + * this function sets up a huge page only if all of the following
> + * conditions are met:

It should be "if any of the following condition is met". Or, does NOT
setup if all of ...

> + *
> + * - MTRRs are disabled.
> + * - The range is mapped uniformly by an MTRR, i.e. the range is
> + * fully covered by a single MTRR entry or the default type.
> + * - The MTRR memory type is WB.
> *
> * Returns 1 on success and 0 on failure.
> */
> int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
> {
> - u8 mtrr;
> + u8 mtrr, uniform;
>
> - mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE);
> - if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
> + mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE, &uniform);
> + if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) &&
> + (mtrr != MTRR_TYPE_WRBACK))
> return 0;
>
> prot = pgprot_4k_2_large(prot);
> @@ -593,20 +598,28 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
> /**
> * pmd_set_huge - setup kernel PMD mapping
> *
> - * MTRR can override PAT memory types with 4KiB granularity. Therefore,
> - * this function does not set up a huge page when the range is covered
> - * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are
> - * disabled.
> + * MTRRs can override PAT memory types with 4KiB granularity. Therefore,
> + * this function sets up a huge page only if all of the following
> + * conditions are met:

Ditto.

> + *
> + * - MTRR is disabled.
> + * - The range is mapped uniformly by an MTRR, i.e. the range is
> + * fully covered by a single MTRR entry or the default type.
> + * - The MTRR memory type is WB.
> *
> * Returns 1 on success and 0 on failure.
> */
> int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
> {
> - u8 mtrr;
> + u8 mtrr, uniform;
>
> - mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE);
> - if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
> + mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE, &uniform);
> + if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) &&
> + (mtrr != MTRR_TYPE_WRBACK)) {
> + pr_warn_once("%s: Cannot satisfy [mem %#010llx-%#010llx] with a huge-page mapping due to MTRR override.\n",
> + __func__, addr, addr + PMD_SIZE);

This new message looks good.

Thanks,
-Toshi

2015-05-18 19:02:00

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping

On Mon, May 18, 2015 at 11:22:39AM -0600, Toshi Kani wrote:
> > diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
> > index c30f9819786b..f1894daa79ee 100644
> > --- a/arch/x86/mm/pgtable.c
> > +++ b/arch/x86/mm/pgtable.c
> > @@ -566,19 +566,24 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
> > /**
> > * pud_set_huge - setup kernel PUD mapping
> > *
> > - * MTRR can override PAT memory types with 4KiB granularity. Therefore,
> > - * this function does not set up a huge page when the range is covered
> > - * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are
> > - * disabled.
> > + * MTRRs can override PAT memory types with 4KiB granularity. Therefore,
> > + * this function sets up a huge page only if all of the following
> > + * conditions are met:
>
> It should be "if any of the following condition is met". Or, does NOT
> setup if all of ...
>
> > + *
> > + * - MTRRs are disabled.
> > + * - The range is mapped uniformly by an MTRR, i.e. the range is
> > + * fully covered by a single MTRR entry or the default type.
> > + * - The MTRR memory type is WB.

Hmm, ok, so this is kinda like "any" but they also depend on each other.
So it is

If
- MTRRs are disabled

or

- MTRRs are enabled and the range is completely covered by a single MTRR

or

- MTRRs are enabled and the range is not completely covered by a
single MTRR but the memory type of the range is WB, even if covered by
multiple MTRRs.

Right?

So tell me this: why do we need to repeat that over those KVA helpers?
It's not like the callers can do anything about it, can they?

So maybe that comment - expanded into more detail - should be over
mtrr_type_lookup() only. That'll be better, methinks.

Hmm.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2015-05-18 19:51:22

by Toshi Kani

[permalink] [raw]
Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping

On Mon, 2015-05-18 at 21:01 +0200, Borislav Petkov wrote:
> On Mon, May 18, 2015 at 11:22:39AM -0600, Toshi Kani wrote:
> > > diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
> > > index c30f9819786b..f1894daa79ee 100644
> > > --- a/arch/x86/mm/pgtable.c
> > > +++ b/arch/x86/mm/pgtable.c
> > > @@ -566,19 +566,24 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
> > > /**
> > > * pud_set_huge - setup kernel PUD mapping
> > > *
> > > - * MTRR can override PAT memory types with 4KiB granularity. Therefore,
> > > - * this function does not set up a huge page when the range is covered
> > > - * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are
> > > - * disabled.
> > > + * MTRRs can override PAT memory types with 4KiB granularity. Therefore,
> > > + * this function sets up a huge page only if all of the following
> > > + * conditions are met:
> >
> > It should be "if any of the following condition is met". Or, does NOT
> > setup if all of ...
> >
> > > + *
> > > + * - MTRRs are disabled.
> > > + * - The range is mapped uniformly by an MTRR, i.e. the range is
> > > + * fully covered by a single MTRR entry or the default type.
> > > + * - The MTRR memory type is WB.
>
> Hmm, ok, so this is kinda like "any" but they also depend on each other.
> So it is
>
> If
> - MTRRs are disabled
>
> or
>
> - MTRRs are enabled and the range is completely covered by a single MTRR
>
> or
>
> - MTRRs are enabled and the range is not completely covered by a
> single MTRR but the memory type of the range is WB, even if covered by
> multiple MTRRs.
>
> Right?

Well, #2 and #3 are independent. That is, uniform can be set regardless
of a type value, and WB can be returned regardless of a uniform value.

#1 is a new condition added per your comment that uniform no longer
covers the MTRR disabled case. Yes, #2 and #3 depend on #1 being false.

> So tell me this: why do we need to repeat that over those KVA helpers?
> It's not like the callers can do anything about it, can they?
>
> So maybe that comment - expanded into more detail - should be over
> mtrr_type_lookup() only. That'll be better, methinks.

The caller is responsible for verifying the conditions that are safe to
create huge page. So, I think the comments are needed here to state
such conditions.

Thanks,
-Toshi

2015-05-18 20:01:25

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping

On Mon, May 18, 2015 at 01:31:59PM -0600, Toshi Kani wrote:
> Well, #2 and #3 are independent. That is, uniform can be set regardless

Not #2 and #3 above - the original #2 and #3 ones. I've written them out
detailed to show what I mean.

> The caller is responsible for verifying the conditions that are safe to
> create huge page.

How is the caller ever going to be able to do anything about it?

Regardless, I'd prefer to not duplicate comments and rather put a short
sentence pointing the reader to the comments over mtrr_type_lookup()
where this all is being explained in detail.

I'll fix it up.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2015-05-18 20:40:34

by Toshi Kani

[permalink] [raw]
Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping

On Mon, 2015-05-18 at 22:01 +0200, Borislav Petkov wrote:
> On Mon, May 18, 2015 at 01:31:59PM -0600, Toshi Kani wrote:
> > Well, #2 and #3 are independent. That is, uniform can be set regardless
>
> Not #2 and #3 above - the original #2 and #3 ones. I've written them out
> detailed to show what I mean.

The original #2 and #3 are set independently as well. They do not depend
on each other condition being a specific value.

> > The caller is responsible for verifying the conditions that are safe to
> > create huge page.
>
> How is the caller ever going to be able to do anything about it?

The caller is the one who makes the condition checks necessary to create
a huge page mapping. mtrr_type_look() only returns how the given range
is related with MTRRs.

> Regardless, I'd prefer to not duplicate comments and rather put a short
> sentence pointing the reader to the comments over mtrr_type_lookup()
> where this all is being explained in detail.
>
> I'll fix it up.

I appreciate your help.

Thanks,
-Toshi

2015-05-18 20:51:36

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping

On Mon, May 18, 2015 at 02:21:08PM -0600, Toshi Kani wrote:
> The caller is the one who makes the condition checks necessary to create
> a huge page mapping.

How? It would go and change MTRRs configuration and ranges and their
memory types so that a huge mapping succeeds?

Or go and try a different range?

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2015-05-18 22:12:37

by Toshi Kani

[permalink] [raw]
Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping

On Mon, 2015-05-18 at 22:51 +0200, Borislav Petkov wrote:
> On Mon, May 18, 2015 at 02:21:08PM -0600, Toshi Kani wrote:
> > The caller is the one who makes the condition checks necessary to create
> > a huge page mapping.
>
> How? It would go and change MTRRs configuration and ranges and their
> memory types so that a huge mapping succeeds?
>
> Or go and try a different range?

Try with a smaller page size.

The callers, pud_set_huge() and pmd_set_huge(), check if the given range
is safe with MTRRs for creating a huge page mapping. If not, they fail
the request, which leads their callers, ioremap_pud_range() and
ioremap_pmd_range(), to retry with a smaller page size, i.e. 1GB -> 2MB
-> 4KB. 4KB may not have overlap with MTRRs (hence no checking is
necessary), which will succeed as before.

Thanks,
-Toshi



2015-05-19 11:44:49

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping

On Mon, May 18, 2015 at 03:53:14PM -0600, Toshi Kani wrote:
> On Mon, 2015-05-18 at 22:51 +0200, Borislav Petkov wrote:
> > On Mon, May 18, 2015 at 02:21:08PM -0600, Toshi Kani wrote:
> > > The caller is the one who makes the condition checks necessary to create
> > > a huge page mapping.
> >
> > How? It would go and change MTRRs configuration and ranges and their
> > memory types so that a huge mapping succeeds?
> >
> > Or go and try a different range?
>
> Try with a smaller page size.
>
> The callers, pud_set_huge() and pmd_set_huge(), check if the given range
> is safe with MTRRs for creating a huge page mapping. If not, they fail
> the request, which leads their callers, ioremap_pud_range() and
> ioremap_pmd_range(), to retry with a smaller page size, i.e. 1GB -> 2MB
> -> 4KB. 4KB may not have overlap with MTRRs (hence no checking is
> necessary), which will succeed as before.

Ok, now *this* should be in the form of a comment over the KVA helpers,
not the MTRR aspect. Callers of those functions would have to know that
- they shouldn't care about MTRR setup.

The MTRR aspect with the 3 conditions should be only over
mtrr_type_lookup().

I'll integrate it into the patch.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2015-05-19 13:23:18

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping

On Tue, May 19, 2015 at 01:44:37PM +0200, Borislav Petkov wrote:
> > Try with a smaller page size.
> >
> > The callers, pud_set_huge() and pmd_set_huge(), check if the given range
> > is safe with MTRRs for creating a huge page mapping. If not, they fail
> > the request, which leads their callers, ioremap_pud_range() and
> > ioremap_pmd_range(), to retry with a smaller page size, i.e. 1GB -> 2MB
> > -> 4KB. 4KB may not have overlap with MTRRs (hence no checking is
> > necessary), which will succeed as before.

Scratch that, I think I have it now. And I even have a good feeling
about it :-)

---
From: Toshi Kani <[email protected]>
Date: Fri, 15 May 2015 12:23:57 -0600
Subject: [PATCH] x86/mm: Enhance MTRR checks in kernel mapping helpers

This patch adds the argument 'uniform' to mtrr_type_lookup(), which gets
set to 1 when a given range is covered uniformly by MTRRs, i.e. the
range is fully covered by a single MTRR entry or the default type.

Change pud_set_huge() and pmd_set_huge() to honor the 'uniform' flag to
see if it is safe to create a huge page mapping in the range.

This allows them to create a huge page mapping in a range covered by
a single MTRR entry of any memory type. It also detects a non-optimal
request properly. They continue to check with the WB type since it does
not effectively change the uniform mapping even if a request spans
multiple MTRR entries.

pmd_set_huge() logs a warning message to a non-optimal request so that
driver writers will be aware of such a case. Drivers should make a
mapping request aligned to a single MTRR entry when the range is covered
by MTRRs.

Signed-off-by: Toshi Kani <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Luis R. Rodriguez <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: linux-mm <[email protected]>
Cc: x86-ml <[email protected]>
Cc: lkml <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Realign, flesh out comments, improve warning message. ]
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/include/asm/mtrr.h | 4 ++--
arch/x86/kernel/cpu/mtrr/generic.c | 40 ++++++++++++++++++++++++++++----------
arch/x86/mm/pat.c | 4 ++--
arch/x86/mm/pgtable.c | 38 +++++++++++++++++++++++-------------
4 files changed, 58 insertions(+), 28 deletions(-)

diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index bb03a547c1ab..a31759e1edd9 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -31,7 +31,7 @@
* arch_phys_wc_add and arch_phys_wc_del.
*/
# ifdef CONFIG_MTRR
-extern u8 mtrr_type_lookup(u64 addr, u64 end);
+extern u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform);
extern void mtrr_save_fixed_ranges(void *);
extern void mtrr_save_state(void);
extern int mtrr_add(unsigned long base, unsigned long size,
@@ -50,7 +50,7 @@ extern int mtrr_trim_uncached_memory(unsigned long end_pfn);
extern int amd_special_default_mtrr(void);
extern int phys_wc_to_mtrr_index(int handle);
# else
-static inline u8 mtrr_type_lookup(u64 addr, u64 end)
+static inline u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform)
{
/*
* Return no-MTRRs:
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index e51100c49eea..f782d9b62cb3 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -147,19 +147,24 @@ static u8 mtrr_type_lookup_fixed(u64 start, u64 end)
* Return Value:
* MTRR_TYPE_(type) - Matched memory type or default memory type (unmatched)
*
- * Output Argument:
+ * Output Arguments:
* repeat - Set to 1 when [start:end] spanned across MTRR range and type
* returned corresponds only to [start:*partial_end]. Caller has
* to lookup again for [*partial_end:end].
+ *
+ * uniform - Set to 1 when an MTRR covers the region uniformly, i.e. the
+ * region is fully covered by a single MTRR entry or the default
+ * type.
*/
static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
- int *repeat)
+ int *repeat, u8 *uniform)
{
int i;
u64 base, mask;
u8 prev_match, curr_match;

*repeat = 0;
+ *uniform = 1;

/* Make end inclusive instead of exclusive */
end--;
@@ -214,6 +219,7 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,

end = *partial_end - 1; /* end is inclusive */
*repeat = 1;
+ *uniform = 0;
}

if ((start & mask) != (base & mask))
@@ -225,6 +231,7 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
continue;
}

+ *uniform = 0;
if (check_type_overlap(&prev_match, &curr_match))
return curr_match;
}
@@ -241,10 +248,15 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
* Return Values:
* MTRR_TYPE_(type) - The effective MTRR type for the region
* MTRR_TYPE_INVALID - MTRR is disabled
+ *
+ * Output Argument:
+ * uniform - Set to 1 when an MTRR covers the region uniformly, i.e. the
+ * region is fully covered by a single MTRR entry or the default
+ * type.
*/
-u8 mtrr_type_lookup(u64 start, u64 end)
+u8 mtrr_type_lookup(u64 start, u64 end, u8 *uniform)
{
- u8 type, prev_type;
+ u8 type, prev_type, is_uniform = 1, dummy;
int repeat;
u64 partial_end;

@@ -260,14 +272,18 @@ u8 mtrr_type_lookup(u64 start, u64 end)
*/
if ((start < 0x100000) &&
(mtrr_state.have_fixed) &&
- (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED))
- return mtrr_type_lookup_fixed(start, end);
+ (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) {
+ is_uniform = 0;
+ type = mtrr_type_lookup_fixed(start, end);
+ goto out;
+ }

/*
* Look up the variable ranges. Look of multiple ranges matching
* this address and pick type as per MTRR precedence.
*/
- type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat);
+ type = mtrr_type_lookup_variable(start, end, &partial_end,
+ &repeat, &is_uniform);

/*
* Common path is with repeat = 0.
@@ -278,15 +294,19 @@ u8 mtrr_type_lookup(u64 start, u64 end)
while (repeat) {
prev_type = type;
start = partial_end;
- type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat);
+ is_uniform = 0;
+ type = mtrr_type_lookup_variable(start, end, &partial_end,
+ &repeat, &dummy);

if (check_type_overlap(&prev_type, &type))
- return type;
+ goto out;
}

if (mtrr_tom2 && (start >= (1ULL<<32)) && (end < mtrr_tom2))
- return MTRR_TYPE_WRBACK;
+ type = MTRR_TYPE_WRBACK;

+out:
+ *uniform = is_uniform;
return type;
}

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 35af6771a95a..372ad422c2c3 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -267,9 +267,9 @@ static unsigned long pat_x_mtrr_type(u64 start, u64 end,
* request is for WB.
*/
if (req_type == _PAGE_CACHE_MODE_WB) {
- u8 mtrr_type;
+ u8 mtrr_type, uniform;

- mtrr_type = mtrr_type_lookup(start, end);
+ mtrr_type = mtrr_type_lookup(start, end, &uniform);
if (mtrr_type != MTRR_TYPE_WRBACK)
return _PAGE_CACHE_MODE_UC_MINUS;

diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index c30f9819786b..df2f8a587438 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -566,19 +566,28 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
/**
* pud_set_huge - setup kernel PUD mapping
*
- * MTRR can override PAT memory types with 4KiB granularity. Therefore,
- * this function does not set up a huge page when the range is covered
- * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are
- * disabled.
+ * MTRRs can override PAT memory types with 4KiB granularity. Therefore, this
+ * function sets up a huge page only if any of the following conditions are met:
+ *
+ * - MTRRs are disabled, or
+ *
+ * - MTRRs are enabled and the range is completely covered by a single MTRR, or
+ *
+ * - MTRRs are enabled and the range is not completely covered by a single MTRR
+ * but the memory type of the range is WB, even if covered by multiple MTRRs.
+ *
+ * Callers should try to decrease page size (1GB -> 2MB -> 4K) if the bigger
+ * page mapping attempt fails.
*
* Returns 1 on success and 0 on failure.
*/
int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
{
- u8 mtrr;
+ u8 mtrr, uniform;

- mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE);
- if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
+ mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE, &uniform);
+ if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) &&
+ (mtrr != MTRR_TYPE_WRBACK))
return 0;

prot = pgprot_4k_2_large(prot);
@@ -593,20 +602,21 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
/**
* pmd_set_huge - setup kernel PMD mapping
*
- * MTRR can override PAT memory types with 4KiB granularity. Therefore,
- * this function does not set up a huge page when the range is covered
- * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are
- * disabled.
+ * See text over pud_set_huge() above.
*
* Returns 1 on success and 0 on failure.
*/
int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
{
- u8 mtrr;
+ u8 mtrr, uniform;

- mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE);
- if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
+ mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE, &uniform);
+ if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) &&
+ (mtrr != MTRR_TYPE_WRBACK)) {
+ pr_warn_once("%s: Cannot satisfy [mem %#010llx-%#010llx] with a huge-page mapping due to MTRR override.\n",
+ __func__, addr, addr + PMD_SIZE);
return 0;
+ }

prot = pgprot_4k_2_large(prot);

--
2.3.5

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2015-05-19 14:06:34

by Toshi Kani

[permalink] [raw]
Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping

On Tue, 2015-05-19 at 15:23 +0200, Borislav Petkov wrote:
> On Tue, May 19, 2015 at 01:44:37PM +0200, Borislav Petkov wrote:
> > > Try with a smaller page size.
> > >
> > > The callers, pud_set_huge() and pmd_set_huge(), check if the given range
> > > is safe with MTRRs for creating a huge page mapping. If not, they fail
> > > the request, which leads their callers, ioremap_pud_range() and
> > > ioremap_pmd_range(), to retry with a smaller page size, i.e. 1GB -> 2MB
> > > -> 4KB. 4KB may not have overlap with MTRRs (hence no checking is
> > > necessary), which will succeed as before.
>
> Scratch that, I think I have it now. And I even have a good feeling
> about it :-)

Looks good. Thanks for the update!
-Toshi

2015-05-20 11:55:17

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping


* Borislav Petkov <[email protected]> wrote:

> --- a/arch/x86/mm/pgtable.c
> +++ b/arch/x86/mm/pgtable.c
> @@ -566,19 +566,28 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
> /**
> * pud_set_huge - setup kernel PUD mapping
> *
> - * MTRR can override PAT memory types with 4KiB granularity. Therefore,
> - * this function does not set up a huge page when the range is covered
> - * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are
> - * disabled.
> + * MTRRs can override PAT memory types with 4KiB granularity. Therefore, this
> + * function sets up a huge page only if any of the following conditions are met:
> + *
> + * - MTRRs are disabled, or
> + *
> + * - MTRRs are enabled and the range is completely covered by a single MTRR, or
> + *
> + * - MTRRs are enabled and the range is not completely covered by a single MTRR
> + * but the memory type of the range is WB, even if covered by multiple MTRRs.
> + *
> + * Callers should try to decrease page size (1GB -> 2MB -> 4K) if the bigger
> + * page mapping attempt fails.

This comment should explain why it's ok in the WB case.

Also, the phrase 'the memory type of the range' is ambiguous: it might
mean the partial MTRR's, or the memory type specified via PAT by the
huge-pmd entry.

Thanks,

Ingo

2015-05-20 14:53:41

by Toshi Kani

[permalink] [raw]
Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping

On Wed, 2015-05-20 at 13:55 +0200, Ingo Molnar wrote:
> * Borislav Petkov <[email protected]> wrote:
>
> > --- a/arch/x86/mm/pgtable.c
> > +++ b/arch/x86/mm/pgtable.c
> > @@ -566,19 +566,28 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
> > /**
> > * pud_set_huge - setup kernel PUD mapping
> > *
> > - * MTRR can override PAT memory types with 4KiB granularity. Therefore,
> > - * this function does not set up a huge page when the range is covered
> > - * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are
> > - * disabled.
> > + * MTRRs can override PAT memory types with 4KiB granularity. Therefore, this
> > + * function sets up a huge page only if any of the following conditions are met:
> > + *
> > + * - MTRRs are disabled, or
> > + *
> > + * - MTRRs are enabled and the range is completely covered by a single MTRR, or
> > + *
> > + * - MTRRs are enabled and the range is not completely covered by a single MTRR
> > + * but the memory type of the range is WB, even if covered by multiple MTRRs.
> > + *
> > + * Callers should try to decrease page size (1GB -> 2MB -> 4K) if the bigger
> > + * page mapping attempt fails.
>
> This comment should explain why it's ok in the WB case.
>
> Also, the phrase 'the memory type of the range' is ambiguous: it might
> mean the partial MTRR's, or the memory type specified via PAT by the
> huge-pmd entry.

Agreed. How about this sentence?

- MTRRs are enabled and the corresponding MTRR memory type is WB, which
has no effect to the requested PAT memory type.

Thanks,
-Toshi

2015-05-20 15:01:23

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping


* Toshi Kani <[email protected]> wrote:

> On Wed, 2015-05-20 at 13:55 +0200, Ingo Molnar wrote:
> > * Borislav Petkov <[email protected]> wrote:
> >
> > > --- a/arch/x86/mm/pgtable.c
> > > +++ b/arch/x86/mm/pgtable.c
> > > @@ -566,19 +566,28 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
> > > /**
> > > * pud_set_huge - setup kernel PUD mapping
> > > *
> > > - * MTRR can override PAT memory types with 4KiB granularity. Therefore,
> > > - * this function does not set up a huge page when the range is covered
> > > - * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are
> > > - * disabled.
> > > + * MTRRs can override PAT memory types with 4KiB granularity. Therefore, this
> > > + * function sets up a huge page only if any of the following conditions are met:
> > > + *
> > > + * - MTRRs are disabled, or
> > > + *
> > > + * - MTRRs are enabled and the range is completely covered by a single MTRR, or
> > > + *
> > > + * - MTRRs are enabled and the range is not completely covered by a single MTRR
> > > + * but the memory type of the range is WB, even if covered by multiple MTRRs.
> > > + *
> > > + * Callers should try to decrease page size (1GB -> 2MB -> 4K) if the bigger
> > > + * page mapping attempt fails.
> >
> > This comment should explain why it's ok in the WB case.
> >
> > Also, the phrase 'the memory type of the range' is ambiguous: it might
> > mean the partial MTRR's, or the memory type specified via PAT by the
> > huge-pmd entry.
>
> Agreed. How about this sentence?
>
> - MTRRs are enabled and the corresponding MTRR memory type is WB, which
> has no effect to the requested PAT memory type.

s/effect to/effect on

sounds good otherwise!

Btw., if WB MTRR entries can never have an effect on Linux PAT
specified attributes, why do we allow them to be created? I don't
think we ever call into real mode for this to matter?

Thanks,

Ingo

2015-05-20 15:21:50

by Toshi Kani

[permalink] [raw]
Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping

On Wed, 2015-05-20 at 17:01 +0200, Ingo Molnar wrote:
> * Toshi Kani <[email protected]> wrote:
>
> > On Wed, 2015-05-20 at 13:55 +0200, Ingo Molnar wrote:
> > > * Borislav Petkov <[email protected]> wrote:
> > >
> > > > --- a/arch/x86/mm/pgtable.c
> > > > +++ b/arch/x86/mm/pgtable.c
> > > > @@ -566,19 +566,28 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
> > > > /**
> > > > * pud_set_huge - setup kernel PUD mapping
> > > > *
> > > > - * MTRR can override PAT memory types with 4KiB granularity. Therefore,
> > > > - * this function does not set up a huge page when the range is covered
> > > > - * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are
> > > > - * disabled.
> > > > + * MTRRs can override PAT memory types with 4KiB granularity. Therefore, this
> > > > + * function sets up a huge page only if any of the following conditions are met:
> > > > + *
> > > > + * - MTRRs are disabled, or
> > > > + *
> > > > + * - MTRRs are enabled and the range is completely covered by a single MTRR, or
> > > > + *
> > > > + * - MTRRs are enabled and the range is not completely covered by a single MTRR
> > > > + * but the memory type of the range is WB, even if covered by multiple MTRRs.
> > > > + *
> > > > + * Callers should try to decrease page size (1GB -> 2MB -> 4K) if the bigger
> > > > + * page mapping attempt fails.
> > >
> > > This comment should explain why it's ok in the WB case.
> > >
> > > Also, the phrase 'the memory type of the range' is ambiguous: it might
> > > mean the partial MTRR's, or the memory type specified via PAT by the
> > > huge-pmd entry.
> >
> > Agreed. How about this sentence?
> >
> > - MTRRs are enabled and the corresponding MTRR memory type is WB, which
> > has no effect to the requested PAT memory type.
>
> s/effect to/effect on
>
> sounds good otherwise!

Great!

Boris, can you update the patch, or do you want me to send you a patch
for this update?

> Btw., if WB MTRR entries can never have an effect on Linux PAT
> specified attributes, why do we allow them to be created? I don't
> think we ever call into real mode for this to matter?

MTRRs have the default memory type, which is used when the given range
is not covered by any MTRR entries. There are two types of BIOS setup:

1) Default UC
- BIOS sets the default type to UC, and covers all WB accessible ranges
with MTRR entries of WB.

2) Default WB
- BIOS sets the default type to WB, and covers non-WB accessible range
with MTRR entries of other memory types, such as UC.

In both cases, WB type can be returned. In case of 1), the requested
range may overlap with multiple MTRR entries of WB type, which is still
safe.

Thanks,
-Toshi


Thanks,
-Toshi


2015-05-20 16:06:14

by Toshi Kani

[permalink] [raw]
Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping

On Wed, 2015-05-20 at 18:04 +0200, Borislav Petkov wrote:
> On Wed, May 20, 2015 at 09:02:23AM -0600, Toshi Kani wrote:
> > Boris, can you update the patch,
>
> Done.

Thanks!
-Toshi

2015-05-20 16:04:24

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping

On Wed, May 20, 2015 at 09:02:23AM -0600, Toshi Kani wrote:
> Boris, can you update the patch,

Done.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--