Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932775AbcCJWZF (ORCPT ); Thu, 10 Mar 2016 17:25:05 -0500 Received: from g4t3427.houston.hp.com ([15.201.208.55]:58757 "EHLO g4t3427.houston.hp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932436AbcCJWZD (ORCPT ); Thu, 10 Mar 2016 17:25:03 -0500 Message-ID: <1457651856.15454.581.camel@hpe.com> Subject: Re: runtime regression with "x86/mm/pat: Emulate PAT when it is disabled" From: Toshi Kani To: Borislav Petkov Cc: Paul Gortmaker , Richard Purdie , Toshi Kani , Bruce Ashfield , "Hart, Darren" , "saul.wold" , linux-kernel@vger.kernel.org Date: Thu, 10 Mar 2016 16:17:36 -0700 In-Reply-To: <20160310210718.GE2194@pd.tnic> References: <1457393912.15454.419.camel@hpe.com> <20160307235328.GD26051@windriver.com> <1457398578.15454.421.camel@hpe.com> <1457400913.15454.435.camel@hpe.com> <20160310144250.GG23251@windriver.com> <1457628591.15454.542.camel@hpe.com> <20160310172029.GA2194@pd.tnic> <1457640261.15454.551.camel@hpe.com> <20160310192053.GD2194@pd.tnic> <1457641451.15454.561.camel@hpe.com> <20160310210718.GE2194@pd.tnic> Content-Type: multipart/mixed; boundary="=-minLWZzwUSm1jAxg1rC9" X-Mailer: Evolution 3.18.4 (3.18.4-1.fc23) Mime-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 9942 Lines: 342 --=-minLWZzwUSm1jAxg1rC9 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 8bit On Thu, 2016-03-10 at 22:07 +0100, Borislav Petkov wrote: > On Thu, Mar 10, 2016 at 01:24:11PM -0700, Toshi Kani wrote: > > I am not familiar with PPRO_FEATURES, > > That's the feature bits of the "qemu32" model, and others, in qemu. > > > but shouldn't 'flags' in /proc/cpuinfo show "pat" when X86_FEATURE_PAT > > is set? > > static void early_init_intel(struct cpuinfo_x86 *c) > ... > >         /* >          * There is a known erratum on Pentium III and Core Solo >          * and Core Duo CPUs. >          * " Page with PAT set to WC while associated MTRR is UC >          *   may consolidate to UC " >          * Because of this erratum, it is better to stick with >          * setting WC in MTRR rather than using PAT on these CPUs. >          * >          * Enable PAT WC only on P4, Core 2 or later CPUs. >          */ >         if (c->x86 == 6 && c->x86_model < 15) >                 clear_cpu_cap(c, X86_FEATURE_PAT); > --- > > which also gives a hint as to how we should fix this: pat_enabled() > needs to look at that feature bit too: I see.  I will take a look. > --- > diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c > index faec01e7a17d..359c30d9a78c 100644 > --- a/arch/x86/mm/pat.c > +++ b/arch/x86/mm/pat.c > @@ -56,7 +56,7 @@ early_param("nopat", nopat); >   >  bool pat_enabled(void) >  { > - return !!__pat_enabled; > + return !!__pat_enabled && static_cpu_has(X86_FEATURE_PAT); >  } >  EXPORT_SYMBOL_GPL(pat_enabled); > --- > > Makes sense? Yes, I agree that pat_enable() needs to check the PAT feature bit.  In some reason, static_cpu_has(X86_FEATURE_PAT) returns 0 while cpu_has_pat returns 1 in my testing...  I need to check this. > > pat_init() is being called as part of MTRR setup because PAT > > initialization requires the same CPU rendezvous operation implemented > > in the MTRR code. > > ... which means, PAT depends on MTRR being present. Yes, and we need more changes to handle this dependency since MTRR does not call pat_init() when it is disabled. Attached is the changes I am working now.  I will include your changes, and send them out once I finished testing.  Let me know if you have any suggestion. Thanks, -Toshi --=-minLWZzwUSm1jAxg1rC9 Content-Type: message/rfc822; name="01-pat-disable" Content-Disposition: attachment; filename="01-pat-disable" From: Toshi Kani Date: Thu, 10 Mar 2016 16:04:11 -0700 Subject: No Subject Message-ID: <1457651051.15454.573.camel@hpe.com> Mime-Version: 1.0 Content-Transfer-Encoding: 7bit --- arch/x86/include/asm/pat.h | 1 + arch/x86/mm/pat.c | 84 +++++++++++++++++++++++++++----------------- 2 files changed, 52 insertions(+), 33 deletions(-) diff --git a/arch/x86/include/asm/pat.h b/arch/x86/include/asm/pat.h index ca6c228..016142b 100644 --- a/arch/x86/include/asm/pat.h +++ b/arch/x86/include/asm/pat.h @@ -5,6 +5,7 @@ #include bool pat_enabled(void); +void pat_disable(const char *reason); extern void pat_init(void); void pat_init_cache_modes(u64); diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c index f4ae536..af2b3d8 100644 --- a/arch/x86/mm/pat.c +++ b/arch/x86/mm/pat.c @@ -40,11 +40,19 @@ static bool boot_cpu_done; static int __read_mostly __pat_enabled = IS_ENABLED(CONFIG_X86_PAT); +static void pat_disable_init(void); -static inline void pat_disable(const char *reason) +void pat_disable(const char *reason) { + if (boot_cpu_done) { + pr_info("x86/PAT: PAT cannot be disabled after initialized\n"); + return; + } + __pat_enabled = 0; pr_info("x86/PAT: %s\n", reason); + + pat_disable_init(); } static int __init nopat(char *str) @@ -207,9 +215,6 @@ static void pat_bsp_init(u64 pat) return; } - if (!pat_enabled()) - goto done; - rdmsrl(MSR_IA32_CR_PAT, tmp_pat); if (!tmp_pat) { pat_disable("PAT MSR is 0, disabled."); @@ -218,15 +223,11 @@ static void pat_bsp_init(u64 pat) wrmsrl(MSR_IA32_CR_PAT, pat); -done: pat_init_cache_modes(pat); } static void pat_ap_init(u64 pat) { - if (!pat_enabled()) - return; - if (!cpu_has_pat) { /* * If this happens we are on a secondary CPU, but switched to @@ -238,38 +239,55 @@ static void pat_ap_init(u64 pat) wrmsrl(MSR_IA32_CR_PAT, pat); } +static void pat_disable_init(void) +{ + u64 pat; + static int disable_init_done = 0; + + if (disable_init_done) + return; + + /* + * No PAT. Emulate the PAT table that corresponds to the two + * cache bits, PWT (Write Through) and PCD (Cache Disable). This + * setup is the same as the BIOS default setup when the system + * has PAT but the "nopat" boot option has been specified. This + * emulated PAT table is used when MSR_IA32_CR_PAT returns 0. + * + * PTE encoding: + * + * PCD + * |PWT PAT + * || slot + * 00 0 WB : _PAGE_CACHE_MODE_WB + * 01 1 WT : _PAGE_CACHE_MODE_WT + * 10 2 UC-: _PAGE_CACHE_MODE_UC_MINUS + * 11 3 UC : _PAGE_CACHE_MODE_UC + * + * NOTE: When WC or WP is used, it is redirected to UC- per + * the default setup in __cachemode2pte_tbl[]. + */ + pat = PAT(0, WB) | PAT(1, WT) | PAT(2, UC_MINUS) | PAT(3, UC) | + PAT(4, WB) | PAT(5, WT) | PAT(6, UC_MINUS) | PAT(7, UC); + + pat_init_cache_modes(pat); + + disable_init_done = 1; +} + void pat_init(void) { u64 pat; struct cpuinfo_x86 *c = &boot_cpu_data; if (!pat_enabled()) { - /* - * No PAT. Emulate the PAT table that corresponds to the two - * cache bits, PWT (Write Through) and PCD (Cache Disable). This - * setup is the same as the BIOS default setup when the system - * has PAT but the "nopat" boot option has been specified. This - * emulated PAT table is used when MSR_IA32_CR_PAT returns 0. - * - * PTE encoding: - * - * PCD - * |PWT PAT - * || slot - * 00 0 WB : _PAGE_CACHE_MODE_WB - * 01 1 WT : _PAGE_CACHE_MODE_WT - * 10 2 UC-: _PAGE_CACHE_MODE_UC_MINUS - * 11 3 UC : _PAGE_CACHE_MODE_UC - * - * NOTE: When WC or WP is used, it is redirected to UC- per - * the default setup in __cachemode2pte_tbl[]. - */ - pat = PAT(0, WB) | PAT(1, WT) | PAT(2, UC_MINUS) | PAT(3, UC) | - PAT(4, WB) | PAT(5, WT) | PAT(6, UC_MINUS) | PAT(7, UC); + pat_disable_init(); + return; + } - } else if ((c->x86_vendor == X86_VENDOR_INTEL) && - (((c->x86 == 0x6) && (c->x86_model <= 0xd)) || - ((c->x86 == 0xf) && (c->x86_model <= 0x6)))) { + if ((c->x86_vendor == X86_VENDOR_INTEL) && + (((c->x86 == 0x6) && (c->x86_model <= 0xd)) || + ((c->x86 == 0xf) && (c->x86_model <= 0x6)))) { /* * PAT support with the lower four entries. Intel Pentium 2, * 3, M, and 4 are affected by PAT errata, which makes the --=-minLWZzwUSm1jAxg1rC9 Content-Type: message/rfc822; name="02-mtrr" Content-Disposition: attachment; filename="02-mtrr" From: Toshi Kani Date: Thu, 10 Mar 2016 16:04:11 -0700 Subject: No Subject Message-ID: <1457651051.15454.574.camel@hpe.com> Mime-Version: 1.0 Content-Transfer-Encoding: 7bit --- arch/x86/kernel/cpu/mtrr/generic.c | 24 ++++++++++++++---------- arch/x86/kernel/cpu/mtrr/main.c | 13 ++++++++++++- arch/x86/kernel/cpu/mtrr/mtrr.h | 1 + 3 files changed, 27 insertions(+), 11 deletions(-) diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c index c870af1..136ae86 100644 --- a/arch/x86/kernel/cpu/mtrr/generic.c +++ b/arch/x86/kernel/cpu/mtrr/generic.c @@ -444,11 +444,24 @@ static void __init print_mtrr_state(void) pr_debug("TOM2: %016llx aka %lldM\n", mtrr_tom2, mtrr_tom2>>20); } +/* PAT setup for BP. We need to go through sync steps here */ +void __init mtrr_bp_pat_init(void) +{ + unsigned long flags; + + local_irq_save(flags); + prepare_set(); + + pat_init(); + + post_set(); + local_irq_restore(flags); +} + /* Grab all of the MTRR state for this CPU into *state */ bool __init get_mtrr_state(void) { struct mtrr_var_range *vrs; - unsigned long flags; unsigned lo, dummy; unsigned int i; @@ -481,15 +494,6 @@ bool __init get_mtrr_state(void) mtrr_state_set = 1; - /* PAT setup for BP. We need to go through sync steps here */ - local_irq_save(flags); - prepare_set(); - - pat_init(); - - post_set(); - local_irq_restore(flags); - return !!(mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED); } diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c index 5c3d149..d9e91f1 100644 --- a/arch/x86/kernel/cpu/mtrr/main.c +++ b/arch/x86/kernel/cpu/mtrr/main.c @@ -752,6 +752,9 @@ void __init mtrr_bp_init(void) /* BIOS may override */ __mtrr_enabled = get_mtrr_state(); + if (mtrr_enabled()) + mtrr_bp_pat_init(); + if (mtrr_cleanup(phys_addr)) { changed_by_mtrr_cleanup = 1; mtrr_if->set_all(); @@ -759,8 +762,16 @@ void __init mtrr_bp_init(void) } } - if (!mtrr_enabled()) + if (!mtrr_enabled()) { pr_info("MTRR: Disabled\n"); + + /* + * PAT initialization relies on MTRR's rendezvous handler. + * Disable PAT until the handler can initialize both features + * independently. + */ + pat_disable("PAT disabled by MTRR"); + } } void mtrr_ap_init(void) diff --git a/arch/x86/kernel/cpu/mtrr/mtrr.h b/arch/x86/kernel/cpu/mtrr/mtrr.h index 951884d..6c7ced0 100644 --- a/arch/x86/kernel/cpu/mtrr/mtrr.h +++ b/arch/x86/kernel/cpu/mtrr/mtrr.h @@ -52,6 +52,7 @@ void set_mtrr_prepare_save(struct set_mtrr_context *ctxt); void fill_mtrr_var_range(unsigned int index, u32 base_lo, u32 base_hi, u32 mask_lo, u32 mask_hi); bool get_mtrr_state(void); +void mtrr_bp_pat_init(void); extern void set_mtrr_ops(const struct mtrr_ops *ops); --=-minLWZzwUSm1jAxg1rC9--