Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758478AbcC2XYX (ORCPT ); Tue, 29 Mar 2016 19:24:23 -0400 Received: from g4t3428.houston.hp.com ([15.201.208.56]:6582 "EHLO g4t3428.houston.hp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753775AbcC2XYV (ORCPT ); Tue, 29 Mar 2016 19:24:21 -0400 Message-ID: <1459296994.6393.748.camel@hpe.com> Subject: Re: [PATCH 2/2] x86/mtrr: Refactor PAT initialization code From: Toshi Kani To: "Luis R. Rodriguez" Cc: Boris Ostrovsky , "xen-devel@lists.xensource.com" , Thomas Gleixner , Matt Fleming , Paul Gortmaker , Borislav Petkov , X86 ML , Paul Stewart , Ingo Molnar , Olof Johansson , "linux-kernel@vger.kernel.org" , "H. Peter Anvin" , Juergen Gross , Jan Beulich , Stuart Hayes , Yinghai Lu , Prarit Bhargava Date: Tue, 29 Mar 2016 18:16:34 -0600 In-Reply-To: References: <1457671546-13486-1-git-send-email-toshi.kani@hpe.com> <1457671546-13486-3-git-send-email-toshi.kani@hpe.com> <20160311092400.GB4347@pd.tnic> <1457722632.6393.130.camel@hpe.com> <20160311221747.GC25147@wotan.suse.de> <1457740571.6393.236.camel@hpe.com> <1457745396.6393.257.camel@hpe.com> <20160315001501.GF25147@wotan.suse.de> <1458085724.6393.425.camel@hpe.com> <20160315232916.GJ1990@wotan.suse.de> <1458251807.6393.474.camel@hpe.com> <1458336958.6393.544.camel@hpe.com> <1459288009.6393.699.camel@hpe.com> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.18.4 (3.18.4-1.fc23) Mime-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4517 Lines: 102 On Tue, 2016-03-29 at 15:12 -0700, Luis R. Rodriguez wrote: > On Tue, Mar 29, 2016 at 2:46 PM, Toshi Kani wrote: > > On Tue, 2016-03-29 at 10:14 -0700, Luis R. Rodriguez wrote: > > > On Fri, Mar 18, 2016 at 2:35 PM, Toshi Kani > > > wrote:  : > > > > > > Do we really need UC for the fan? > > > > When you say "we", are you referring Xen guests?  Xen guests do not > > need to control the fan, so they do not need UC set in MTRRs. > > > > In general, yes, MMIO registers need UC when they need to be accessed. > > Curious, what does a BIOS do for fan control when MTRRs are disabled? You mean, when the kernel modified the MTRR setup and disabled them.  BIOS would assume the original setup and still access the registers.  This may lead to undefined behavior and may result in a system crash. > Also what if a BIOS just set MSR_MTRRdefType to uncachable only ? Many BIOSes actually set the default type to UC.  MTRRs then cover regular memory with WB. > Wouldn't that help simplify the BIOS when systems are known as not > wanting to deal with reading MTRRs on the kernel front, even if its > just to read the setup ? Nope. > I'm trying to determine exactly why a BIOS cannot simply enable use an > alternative for what it needs for fan control and let the kernel live > without any MTRR code at run time as an option. Although the > documentation says that the same "procedure" is needed for PAT setup, > I see it possible to split the skeleton of the code and have each > peace of code live separately and compartmentalized, they'd just have > respective calls on the skeleton of the procedure. I agree that the MTRR rendezvous handler can be improved for PAT, but I do not see a compelling reason to make such change now.  With my fix, I think the code works reasonably for Xen. > > > What is the default for PAT? > > > > There is no such thing as the default for PAT. > > > > > Can't > > > the same be used so that we way by default all ranges match what is > > > also the default by PAT? Would that really break fan control ? If we > > > have a match should't we be able to not have to worry about MTRRs at > > > all in-kernel even on bare metal? > > > > We do not need to know about BIOS impl, such as fan control, etc.  The > > point is that if BIOS sets MTRRs, then the kernel keeps their setup. > > Right, if the kernel no longer uses it directly it seems like an > aweful lot of code to keep updating simply for a BIOS requirement, I'm > trying to see if we can have the option to live without this > requirement. Please be aware of the hibernation case. I think this procedure involves setting MTRRs back to the original setup. > > If (virtual) BIOS does not enable MTRRs, the kernel keeps them > > disabled.  We just need not to mess with the setup. > > Sure, thanks! I'm trying to see if we can have a similar option on bare > metal. > > > > Another option, which I've alluded to on the Xen thread is skipping > > > over the MTRR space from the e820 map. Is that not possible ? This > > > could be last resort... but which I'm hinting more for the Xen side > > > of things if we *really* need get_mtrr() on the Xen guest side of > > > things... > > > > There is no MTRR space in the e820 map since they are MSRs.  Since Xen > > guests disable MTRRs, I do not think you have any issue here... > > Xen seems to clip the e820 map given to a guest in certain MTRR > conditions, see init_e820(), this calls > machine_specific_memory_setup() which later clips MTRR if > mtrr_top_of_ram(). This is an Intel check that trims the e820 map if > MTRRs were found to be enabled and the default MTRR is not write-back. > If returns the address of the first non write-back variable MTRR, it > uses clip_to_limit() to limit the exposed memory [0], notice how > clip_to_limit() is also used to generally limit exposed memory through > the opt_mem boot parameter as well. Its not exactly clear why that's > done, but this looks very similar to the Linux MTRR cleanup -- see > x86_get_mtrr_mem_range(). > > [0] http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/arch/x86/e820.c It looks to me that the code makes sure all E820_RAM ranges in the e820 table are covered by WB entries of MTRRs.  If not, it trims the e820 table. I suppose it tries to react on a case when someone modified MTRRs and resulted in mismatch with the e820 table.  I'd think you do not need this code as long as you do not modify the MTRR setup. Thanks, -Toshi