2022-10-04 09:28:58

by Jürgen Groß

Subject: [PATCH v4 00/16] x86: make pat and mtrr independent from each other

Today PAT can't be used without MTRR being available, unless MTRR is at
least configured via CONFIG_MTRR and the system is running as a Xen PV
guest. In that case PAT is automatically available via the hypervisor,
but the PAT MSR can't be modified by the kernel and MTRR is disabled.

The same applies to a kernel built with no MTRR support: it won't allow
the PAT MSR to be used, although the only technical reason for that is
that setting up PAT the same way on all CPUs (a requirement of the
processor's cache management) relies on some MTRR specific code.

Fix all of that by:

- moving the function needed by PAT from MTRR specific code one level
up
- reworking the init sequences of MTRR and PAT to be more similar to
each other without calling PAT from MTRR code
- removing the dependency of PAT on MTRR
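
To illustrate the reworked init sequence, here is a minimal sketch of
the kind of common per-CPU init path this aims at. The names used below
(cache_cpu_init(), memory_caching_control, CACHE_MTRR, CACHE_PAT,
mtrr_generic_set_state(), pat_cpu_init()) are assumptions made for this
sketch only and are not taken from this cover letter:

/* Sketch only: one common entry point, PAT no longer reached via MTRR code. */
void cache_cpu_init(void)
{
	unsigned long flags;

	local_irq_save(flags);
	cache_disable();

	if (memory_caching_control & CACHE_MTRR)	/* MTRRs, if configured */
		mtrr_generic_set_state();

	if (memory_caching_control & CACHE_PAT)		/* PAT, independent of MTRR */
		pat_cpu_init();

	cache_enable();
	local_irq_restore(flags);
}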

While working on that I discovered two minor bugs in MTRR code, which
are fixed, too.

There is also some more cleanup done, reducing code size.

Changes in V4:
- new patches 10, 14, 15, 16
- split up old patch 4 into 3 patches
- addressed comments

Changes in V3:
- replace patch 1 by just adding a comment

Changes in V2:
- complete rework of the patches based on comments by Boris Petkov
- added several patches to the series

Juergen Gross (16):
x86/mtrr: add comment for set_mtrr_state() serialization
x86/mtrr: remove unused cyrix_set_all() function
x86/mtrr: replace use_intel() with a local flag
x86/mtrr: rename prepare_set() and post_set()
x86/mtrr: split MTRR specific handling from cache dis/enabling
x86: move some code out of arch/x86/kernel/cpu/mtrr
x86/mtrr: split generic_set_all()
x86/mtrr: remove set_all callback from struct mtrr_ops
x86/mtrr: simplify mtrr_bp_init()
x86/mtrr: get rid of mtrr_enabled bool
x86/mtrr: let cache_aps_delayed_init replace mtrr_aps_delayed_init
x86/mtrr: add a stop_machine() handler calling only cache_cpu_init()
x86: decouple pat and mtrr handling
x86: switch cache_ap_init() to hotplug callback
x86: do MTRR/PAT setup on all secondary CPUs in parallel
x86/mtrr: simplify mtrr_ops initialization

arch/x86/include/asm/cacheinfo.h | 18 +++
arch/x86/include/asm/memtype.h | 5 +-
arch/x86/include/asm/mtrr.h | 13 +-
arch/x86/kernel/cpu/cacheinfo.c | 168 ++++++++++++++++++++++++++
arch/x86/kernel/cpu/common.c | 2 +-
arch/x86/kernel/cpu/mtrr/amd.c | 8 +-
arch/x86/kernel/cpu/mtrr/centaur.c | 8 +-
arch/x86/kernel/cpu/mtrr/cyrix.c | 42 +------
arch/x86/kernel/cpu/mtrr/generic.c | 127 +++-----------------
arch/x86/kernel/cpu/mtrr/mtrr.c | 185 ++++-------------------------
arch/x86/kernel/cpu/mtrr/mtrr.h | 20 ++--
arch/x86/kernel/setup.c | 14 +--
arch/x86/kernel/smpboot.c | 9 +-
arch/x86/mm/pat/memtype.c | 127 ++++++--------------
arch/x86/power/cpu.c | 3 +-
include/linux/cpuhotplug.h | 1 +
16 files changed, 298 insertions(+), 452 deletions(-)

--
2.35.3


2022-10-04 09:30:24

by Jürgen Groß

Subject: [PATCH v4 06/16] x86: move some code out of arch/x86/kernel/cpu/mtrr

Prepare making PAT and MTRR support independent from each other by
moving some code needed by both out of the MTRR specific sources.

Signed-off-by: Juergen Gross <[email protected]>
---
V2:
- move code from cpu/common.c to cpu/cacheinfo.c (Borislav Petkov)
V4:
- carved out all non-movement modifications (Borislav Petkov)
---
arch/x86/kernel/cpu/cacheinfo.c | 77 ++++++++++++++++++++++++++++++
arch/x86/kernel/cpu/mtrr/generic.c | 74 ----------------------------
2 files changed, 77 insertions(+), 74 deletions(-)

diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c
index 5228fb9a3798..ff32b2b1ca23 100644
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -20,6 +20,8 @@
#include <asm/cacheinfo.h>
#include <asm/amd_nb.h>
#include <asm/smp.h>
+#include <asm/mtrr.h>
+#include <asm/tlbflush.h>

#include "cpu.h"

@@ -1043,3 +1045,78 @@ int populate_cache_leaves(unsigned int cpu)

return 0;
}
+
+/*
+ * Disable and enable caches. Needed for changing MTRRs and the PAT MSR.
+ *
+ * Since we are disabling the cache don't allow any interrupts,
+ * they would run extremely slow and would only increase the pain.
+ *
+ * The caller must ensure that local interrupts are disabled and
+ * are reenabled after cache_enable() has been called.
+ */
+static unsigned long saved_cr4;
+static DEFINE_RAW_SPINLOCK(cache_disable_lock);
+
+void cache_disable(void) __acquires(cache_disable_lock)
+{
+	unsigned long cr0;
+
+	/*
+	 * Note that this is not ideal
+	 * since the cache is only flushed/disabled for this CPU while the
+	 * MTRRs are changed, but changing this requires more invasive
+	 * changes to the way the kernel boots
+	 */
+
+	raw_spin_lock(&cache_disable_lock);
+
+	/* Enter the no-fill (CD=1, NW=0) cache mode and flush caches. */
+	cr0 = read_cr0() | X86_CR0_CD;
+	write_cr0(cr0);
+
+	/*
+	 * Cache flushing is the most time-consuming step when programming
+	 * the MTRRs. Fortunately, as per the Intel Software Development
+	 * Manual, we can skip it if the processor supports cache self-
+	 * snooping.
+	 */
+	if (!static_cpu_has(X86_FEATURE_SELFSNOOP))
+		wbinvd();
+
+	/* Save value of CR4 and clear Page Global Enable (bit 7) */
+	if (boot_cpu_has(X86_FEATURE_PGE)) {
+		saved_cr4 = __read_cr4();
+		__write_cr4(saved_cr4 & ~X86_CR4_PGE);
+	}
+
+	/* Flush all TLBs via a mov %cr3, %reg; mov %reg, %cr3 */
+	count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
+	flush_tlb_local();
+
+	if (boot_cpu_has(X86_FEATURE_MTRR))
+		mtrr_disable();
+
+	/* Again, only flush caches if we have to. */
+	if (!static_cpu_has(X86_FEATURE_SELFSNOOP))
+		wbinvd();
+}
+
+void cache_enable(void) __releases(cache_disable_lock)
+{
+	/* Flush TLBs (no need to flush caches - they are disabled) */
+	count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
+	flush_tlb_local();
+
+	if (boot_cpu_has(X86_FEATURE_MTRR))
+		mtrr_enable();
+
+	/* Enable caches */
+	write_cr0(read_cr0() & ~X86_CR0_CD);
+
+	/* Restore value of CR4 */
+	if (boot_cpu_has(X86_FEATURE_PGE))
+		__write_cr4(saved_cr4);
+
+	raw_spin_unlock(&cache_disable_lock);
+}
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index 164d753e9867..bfe13eedaca8 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -731,80 +731,6 @@ void mtrr_enable(void)
mtrr_wrmsr(MSR_MTRRdefType, deftype_lo, deftype_hi);
}

-/*
- * Disable and enable caches. Needed for changing MTRRs and the PAT MSR.
- *
- * Since we are disabling the cache don't allow any interrupts,
- * they would run extremely slow and would only increase the pain.
- *
- * The caller must ensure that local interrupts are disabled and
- * are reenabled after cache_enable() has been called.
- */
-static unsigned long saved_cr4;
-static DEFINE_RAW_SPINLOCK(cache_disable_lock);
-
-void cache_disable(void) __acquires(cache_disable_lock)
-{
-	unsigned long cr0;
-
-	/*
-	 * Note that this is not ideal
-	 * since the cache is only flushed/disabled for this CPU while the
-	 * MTRRs are changed, but changing this requires more invasive
-	 * changes to the way the kernel boots
-	 */
-
-	raw_spin_lock(&cache_disable_lock);
-
-	/* Enter the no-fill (CD=1, NW=0) cache mode and flush caches. */
-	cr0 = read_cr0() | X86_CR0_CD;
-	write_cr0(cr0);
-
-	/*
-	 * Cache flushing is the most time-consuming step when programming
-	 * the MTRRs. Fortunately, as per the Intel Software Development
-	 * Manual, we can skip it if the processor supports cache self-
-	 * snooping.
-	 */
-	if (!static_cpu_has(X86_FEATURE_SELFSNOOP))
-		wbinvd();
-
-	/* Save value of CR4 and clear Page Global Enable (bit 7) */
-	if (boot_cpu_has(X86_FEATURE_PGE)) {
-		saved_cr4 = __read_cr4();
-		__write_cr4(saved_cr4 & ~X86_CR4_PGE);
-	}
-
-	/* Flush all TLBs via a mov %cr3, %reg; mov %reg, %cr3 */
-	count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
-	flush_tlb_local();
-
-	if (boot_cpu_has(X86_FEATURE_MTRR))
-		mtrr_disable();
-
-	/* Again, only flush caches if we have to. */
-	if (!static_cpu_has(X86_FEATURE_SELFSNOOP))
-		wbinvd();
-}
-
-void cache_enable(void) __releases(cache_disable_lock)
-{
-	/* Flush TLBs (no need to flush caches - they are disabled) */
-	count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
-	flush_tlb_local();
-
-	if (boot_cpu_has(X86_FEATURE_MTRR))
-		mtrr_enable();
-
-	/* Enable caches */
-	write_cr0(read_cr0() & ~X86_CR0_CD);
-
-	/* Restore value of CR4 */
-	if (boot_cpu_has(X86_FEATURE_PGE))
-		__write_cr4(saved_cr4);
-	raw_spin_unlock(&cache_disable_lock);
-}
-
static void generic_set_all(void)
{
unsigned long mask, count;
--
2.35.3
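
For context, the comment above cache_disable() spells out the calling
convention: local interrupts must be disabled by the caller and
re-enabled only after cache_enable() has been called. A hedged sketch
of such a caller follows; the names cache_cpu_update() and
program_mtrr_and_pat() are placeholders and not part of this patch:

/* Hypothetical caller: wraps MTRR/PAT programming as required above. */
static void cache_cpu_update(void)
{
	unsigned long flags;

	local_irq_save(flags);		/* interrupts off across the whole sequence */

	cache_disable();
	program_mtrr_and_pat();		/* write MTRRs and/or the PAT MSR here */
	cache_enable();

	local_irq_restore(flags);
}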

2022-10-19 07:42:11

by Jürgen Groß

Subject: Re: [PATCH v4 00/16] x86: make pat and mtrr independent from each other

Ping?

On 04.10.22 10:10, Juergen Gross wrote:
> [...]

