A Xorg failure on qemu32 was reported as a regression caused
by 'commit 9cd25aac1f44 ("x86/mm/pat: Emulate PAT when it is
disabled")'. [1] This patch-set fixes the regression.
Negative effects of this regression were two failures in Xorg
on qemu32 env, which were triggered by the fact that its virtual
CPU does not support MTRR. [2]
#1. copy_process() failed in the check in reserve_pfn_range()
#2. error path in copy_process() then hit WARN_ON_ONCE in
untrack_pfn().
These negative effects are caused by two separate bugs, but they
can be dealt in lower priority when PAT is properly initialized.
This patch-set fixes a long-standing PAT initialization issue.
Please see the changelog in patch 3/6 for the details of the
issue.
- Patch 1-2 makes necessary enhancement to PAT for the fix without
breaking Xen.
- Patch 3 fixes the regression.
- Patch 4 fixes an MTRR issue related with PAT init.
- Patch 5 removes PAT init code from Xen.
- Patch 6 adds PAT init to documentation.
[1]: https://lkml.org/lkml/2016/3/3/828
[2]: https://lkml.org/lkml/2016/3/4/775
I'd appreciate if someone can test this patch-set on Xen to make
sure that there is no change in "x86/PAT: Configuration [0-7] .."
message in dmesg.
---
v2:
- Divide patch-set into a single change. (Borislav Petkov)
- Xen's case must be handled properly. (Luis R. Rodriguez)
- Change changelog and title to describe the issue. (Ingo Molnar)
- Update an error message. (Robert Elliott, Borislav Petkov)
---
Toshi Kani (6):
1/6 x86/mm/pat: Change PAT to support non-default PAT MSR
2/6 x86/mm/pat: Add pat_disable() interface
3/6 x86/mtrr: Fix Xorg crashes in Qemu sessions
4/6 x86/mtrr: Fix PAT init handling when MTRR MSR is disabled
5/6 x86/xen,pat: Remove PAT table init code from Xen
6/6 x86/pat: Document PAT initializations
---
Documentation/x86/pat.txt | 32 ++++++++++++
arch/x86/include/asm/mtrr.h | 6 ++-
arch/x86/include/asm/pat.h | 2 +-
arch/x86/kernel/cpu/mtrr/generic.c | 24 +++++----
arch/x86/kernel/cpu/mtrr/main.c | 13 ++++-
arch/x86/kernel/cpu/mtrr/mtrr.h | 1 +
arch/x86/mm/pat.c | 103 +++++++++++++++++++++++++++++--------
arch/x86/xen/enlighten.c | 9 ----
8 files changed, 146 insertions(+), 44 deletions(-)
In preparation to fix a regression caused by 'commit 9cd25aac1f44
("x86/mm/pat: Emulate PAT when it is disabled")', PAT needs to
support a case that PAT MSR is initialized with a non-default
value.
When pat_init() is called in PAT disable state, it initializes
PAT table with the BIOS default value. Xen, however, sets PAT MSR
with a non-default value to enable WC. This causes inconsistency
between PAT table and PAT MSR when PAT is set to disable on Xen.
Change pat_init() to handle the PAT disable cases properly. Add
pat_keep_handoff_state() to handle two cases when PAT is set to
disable.
1. CPU supports PAT: Set PAT table to be consistent with PAT MSR.
2. CPU does not support PAT: Set PAT table to be consistent with
PWT and PCD bits in a PTE.
Signed-off-by: Toshi Kani <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Luis R. Rodriguez <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
---
arch/x86/mm/pat.c | 80 +++++++++++++++++++++++++++++++++++++++++------------
1 file changed, 62 insertions(+), 18 deletions(-)
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 04e2e71..e0a34b0 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -207,9 +207,6 @@ static void pat_bsp_init(u64 pat)
return;
}
- if (!pat_enabled())
- goto done;
-
rdmsrl(MSR_IA32_CR_PAT, tmp_pat);
if (!tmp_pat) {
pat_disable("PAT MSR is 0, disabled.");
@@ -218,15 +215,11 @@ static void pat_bsp_init(u64 pat)
wrmsrl(MSR_IA32_CR_PAT, pat);
-done:
pat_init_cache_modes(pat);
}
static void pat_ap_init(u64 pat)
{
- if (!pat_enabled())
- return;
-
if (!cpu_has_pat) {
/*
* If this happens we are on a secondary CPU, but switched to
@@ -238,18 +231,43 @@ static void pat_ap_init(u64 pat)
wrmsrl(MSR_IA32_CR_PAT, pat);
}
-void pat_init(void)
+/**
+ * pat_keep_handoff_state - Set PAT table to the handoff state
+ *
+ * This function keeps PAT in the BIOS handoff state. When CPU supports
+ * PAT, it sets PAT table to be consistent with PAT MSR. When CPU does not
+ * support PAT, it emulates PAT by setting PAT table consistent with PWT
+ * and PCD bits in a PTE.
+ *
+ * The PAT table is global to all CPUs, which is initialized once at
+ * boot-time. Any subsequent calls to this function have no effect.
+ */
+static void pat_keep_handoff_state(void)
{
- u64 pat;
- struct cpuinfo_x86 *c = &boot_cpu_data;
+ u64 pat = 0;
+ static int set_handoff_done;
- if (!pat_enabled()) {
+ if (set_handoff_done)
+ return;
+
+ if (boot_cpu_has(X86_FEATURE_PAT)) {
+ /*
+ * CPU supports PAT. Set PAT table to be consistent with
+ * PAT MSR. This case supports "nopat" boot option, and
+ * virtual machine environments which support PAT without
+ * MTRR. In specific, Xen has unique setup to PAT MSR.
+ *
+ * If PAT MSR returns 0, it is considered invalid and emulates
+ * as No PAT.
+ */
+ rdmsrl(MSR_IA32_CR_PAT, pat);
+ }
+
+ if (!pat) {
/*
* No PAT. Emulate the PAT table that corresponds to the two
- * cache bits, PWT (Write Through) and PCD (Cache Disable). This
- * setup is the same as the BIOS default setup when the system
- * has PAT but the "nopat" boot option has been specified. This
- * emulated PAT table is used when MSR_IA32_CR_PAT returns 0.
+ * cache bits, PWT (Write Through) and PCD (Cache Disable).
+ * This setup is also the same as the BIOS default setup.
*
* PTE encoding:
*
@@ -266,10 +284,36 @@ void pat_init(void)
*/
pat = PAT(0, WB) | PAT(1, WT) | PAT(2, UC_MINUS) | PAT(3, UC) |
PAT(4, WB) | PAT(5, WT) | PAT(6, UC_MINUS) | PAT(7, UC);
+ }
+
+ pat_init_cache_modes(pat);
+
+ set_handoff_done = 1;
+}
+
+/**
+ * pat_init - Initialize PAT MSR and PAT table
+ *
+ * This function initializes PAT MSR and PAT table with an OS-defined value
+ * to enable additional cache attributes, WC and WT.
+ *
+ * This function must be called on all CPUs with the specific sequence of
+ * operations defined in Intel SDM. mtrr_rendezvous_handler() provides this
+ * procedure for PAT.
+ */
+void pat_init(void)
+{
+ u64 pat;
+ struct cpuinfo_x86 *c = &boot_cpu_data;
+
+ if (!pat_enabled()) {
+ pat_keep_handoff_state();
+ return;
+ }
- } else if ((c->x86_vendor == X86_VENDOR_INTEL) &&
- (((c->x86 == 0x6) && (c->x86_model <= 0xd)) ||
- ((c->x86 == 0xf) && (c->x86_model <= 0x6)))) {
+ if ((c->x86_vendor == X86_VENDOR_INTEL) &&
+ (((c->x86 == 0x6) && (c->x86_model <= 0xd)) ||
+ ((c->x86 == 0xf) && (c->x86_model <= 0x6)))) {
/*
* PAT support with the lower four entries. Intel Pentium 2,
* 3, M, and 4 are affected by PAT errata, which makes the
$Subject is misleading - there's no non-default PAT MSR - the setting is
non-default.
On Wed, Mar 16, 2016 at 06:44:57PM -0600, Toshi Kani wrote:
> In preparation to fix a regression caused by 'commit 9cd25aac1f44
> ("x86/mm/pat: Emulate PAT when it is disabled")', PAT needs to
> support a case that PAT MSR is initialized with a non-default
> value.
>
> When pat_init() is called in PAT disable state, it initializes
is called and PAT is disabled
> PAT table with the BIOS default value. Xen, however, sets PAT MSR
> with a non-default value to enable WC. This causes inconsistency
> between PAT table and PAT MSR when PAT is set to disable on Xen.
>
> Change pat_init() to handle the PAT disable cases properly. Add
> pat_keep_handoff_state() to handle two cases when PAT is set to
> disable.
> 1. CPU supports PAT: Set PAT table to be consistent with PAT MSR.
> 2. CPU does not support PAT: Set PAT table to be consistent with
> PWT and PCD bits in a PTE.
>
> Signed-off-by: Toshi Kani <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Luis R. Rodriguez <[email protected]>
> Cc: Juergen Gross <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: H. Peter Anvin <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> ---
> arch/x86/mm/pat.c | 80 +++++++++++++++++++++++++++++++++++++++++------------
> 1 file changed, 62 insertions(+), 18 deletions(-)
>
> diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
> index 04e2e71..e0a34b0 100644
> --- a/arch/x86/mm/pat.c
> +++ b/arch/x86/mm/pat.c
> @@ -207,9 +207,6 @@ static void pat_bsp_init(u64 pat)
> return;
> }
>
> - if (!pat_enabled())
> - goto done;
> -
> rdmsrl(MSR_IA32_CR_PAT, tmp_pat);
> if (!tmp_pat) {
> pat_disable("PAT MSR is 0, disabled.");
> @@ -218,15 +215,11 @@ static void pat_bsp_init(u64 pat)
>
> wrmsrl(MSR_IA32_CR_PAT, pat);
>
> -done:
> pat_init_cache_modes(pat);
> }
>
> static void pat_ap_init(u64 pat)
> {
> - if (!pat_enabled())
> - return;
> -
> if (!cpu_has_pat) {
> /*
> * If this happens we are on a secondary CPU, but switched to
> @@ -238,18 +231,43 @@ static void pat_ap_init(u64 pat)
> wrmsrl(MSR_IA32_CR_PAT, pat);
> }
>
> -void pat_init(void)
> +/**
> + * pat_keep_handoff_state - Set PAT table to the handoff state
> + *
> + * This function keeps PAT in the BIOS handoff state. When CPU supports
> + * PAT, it sets PAT table to be consistent with PAT MSR. When CPU does not
> + * support PAT, it emulates PAT by setting PAT table consistent with PWT
> + * and PCD bits in a PTE.
> + *
> + * The PAT table is global to all CPUs, which is initialized once at
> + * boot-time. Any subsequent calls to this function have no effect.
> + */
> +static void pat_keep_handoff_state(void)
Static function, no need for "pat_" prefix. Also, no need for the
kernel-doc comment.
Also, no need for all that handoff nomenclature etc, just call it
setup_pat(). Because it does exactly that - it sets up the PAT bits
unconditionally, regardless of enabled or not.
> {
> - u64 pat;
> - struct cpuinfo_x86 *c = &boot_cpu_data;
> + u64 pat = 0;
> + static int set_handoff_done;
s/set_handoff_done/pat_setup_done/
--
Regards/Gruss,
Boris.
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--
On Tue, 2016-03-22 at 17:57 +0100, Borislav Petkov wrote:
> $Subject is misleading - there's no non-default PAT MSR - the setting is
> non-default.
Right. Will change to "Add support of non-default PAT MSR setting at
handoff".
> On Wed, Mar 16, 2016 at 06:44:57PM -0600, Toshi Kani wrote:
> > In preparation to fix a regression caused by 'commit 9cd25aac1f44
> > ("x86/mm/pat: Emulate PAT when it is disabled")', PAT needs to
> > support a case that PAT MSR is initialized with a non-default
> > value.
> >
> > When pat_init() is called in PAT disable state, it initializes
>
> is called and PAT is disabled
Will do.
> > PAT table with the BIOS default value. Xen, however, sets PAT MSR
> > with a non-default value to enable WC. This causes inconsistency
> > between PAT table and PAT MSR when PAT is set to disable on Xen.
> >
> > Change pat_init() to handle the PAT disable cases properly. Add
> > pat_keep_handoff_state() to handle two cases when PAT is set to
> > disable.
> > 1. CPU supports PAT: Set PAT table to be consistent with PAT MSR.
> > 2. CPU does not support PAT: Set PAT table to be consistent with
> > PWT and PCD bits in a PTE.
> >
:
> > +/**
> > + * pat_keep_handoff_state - Set PAT table to the handoff state
> > + *
> > + * This function keeps PAT in the BIOS handoff state. When CPU
> > supports
> > + * PAT, it sets PAT table to be consistent with PAT MSR. When CPU does
> > not
> > + * support PAT, it emulates PAT by setting PAT table consistent with
> > PWT
> > + * and PCD bits in a PTE.
> > + *
> > + * The PAT table is global to all CPUs, which is initialized once at
> > + * boot-time. Any subsequent calls to this function have no effect.
> > + */
> > +static void pat_keep_handoff_state(void)
>
> Static function, no need for "pat_" prefix. Also, no need for the
> kernel-doc comment.
>
> Also, no need for all that handoff nomenclature etc, just call it
> setup_pat(). Because it does exactly that - it sets up the PAT bits
> unconditionally, regardless of enabled or not.
I'd like to make it clear that this function does not set PAT MSR, unlike
what pat_init() does. When CPU supports PAT, it keeps PAT MSR in whatever
the setting at handoff, and initializes PAT table to match with this
setting.
I am open to a better name, but I am afraid that setup_pat() can be
confusing as if it sets PAT MSR.
> > {
> > - u64 pat;
> > - struct cpuinfo_x86 *c = &boot_cpu_data;
> > + u64 pat = 0;
> > + static int set_handoff_done;
>
> s/set_handoff_done/pat_setup_done/
I will match it with a func name once we decided.
Thanks,
-Toshi
On Tue, 22 Mar 2016, Borislav Petkov wrote:
> > +static void pat_keep_handoff_state(void)
>
> Static function, no need for "pat_" prefix. Also, no need for the
> kernel-doc comment.
I have to disagree. kernel-doc comments are not limited to global
functions.
I realy prefer kernel-doc style for static functions over randomly formatted
ones for consistency reasons.
Thanks,
tglx
On Tue, Mar 22, 2016 at 12:35:19PM -0600, Toshi Kani wrote:
> Right. Will change to "Add support of non-default PAT MSR setting at
> handoff".
Please remove this "handoff" notion from the text. Every hw register is
being handed off to the OS once the kernel takes over so there's no need
to make it special here.
> I'd like to make it clear that this function does not set PAT MSR, unlike
> what pat_init() does. When CPU supports PAT, it keeps PAT MSR in whatever
> the setting at handoff, and initializes PAT table to match with this
> setting.
>
> I am open to a better name, but I am afraid that setup_pat() can be
> confusing as if it sets PAT MSR.
So call it init_cache_modes() and rename the current
pat_init_cache_modes() to __init_cache_modes() to denote it is a lower
level helper of the init_cache_modes() one. The init_cache_modes() one
deals with the higher level figuring out of whether PAT is enabled and
if not, preparing the attr bits for emulation. In the end, it calls
__init_cache_modes(). All nice and easy.
--
Regards/Gruss,
Boris.
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--
On Wed, 2016-03-23 at 09:43 +0100, Borislav Petkov wrote:
> On Tue, Mar 22, 2016 at 12:35:19PM -0600, Toshi Kani wrote:
> > Right. Will change to "Add support of non-default PAT MSR setting at
> > handoff".
>
> Please remove this "handoff" notion from the text. Every hw register is
> being handed off to the OS once the kernel takes over so there's no need
> to make it special here.
Will do.
> > I'd like to make it clear that this function does not set PAT MSR,
> > unlike what pat_init() does. When CPU supports PAT, it keeps PAT MSR
> > in whatever the setting at handoff, and initializes PAT table to match
> > with this setting.
> >
> > I am open to a better name, but I am afraid that setup_pat() can be
> > confusing as if it sets PAT MSR.
>
> So call it init_cache_modes() and rename the current
> pat_init_cache_modes() to __init_cache_modes() to denote it is a lower
> level helper of the init_cache_modes() one. The init_cache_modes() one
> deals with the higher level figuring out of whether PAT is enabled and
> if not, preparing the attr bits for emulation. In the end, it calls
> __init_cache_modes(). All nice and easy.
Good idea! Will do.
Thanks,
-Toshi