Date: Tue, 19 May 2015 17:44:52 -0500
From: Bjorn Helgaas
To: "Luis R. Rodriguez"
Cc: mst@redhat.com, plagnioj@jcrosoft.com, tomi.valkeinen@ti.com,
	airlied@linux.ie, daniel.vetter@intel.com, linux-fbdev@vger.kernel.org,
	luto@amacapital.net, cocci@systeme.lip6.fr, linux-kernel@vger.kernel.org,
	linux-pci@vger.kernel.org, "Luis R. Rodriguez", Toshi Kani,
	Suresh Siddha, Ingo Molnar, Thomas Gleixner, Juergen Gross,
	Daniel Vetter, Dave Airlie, Antonino Daplas, Dave Hansen,
	Arnd Bergmann, venkatesh.pallipadi@intel.com, Stefan Bader,
	Ville Syrjälä, Mel Gorman, Vlastimil Babka, Borislav Petkov,
	Davidlohr Bueso, konrad.wilk@oracle.com, ville.syrjala@linux.intel.com,
	david.vrabel@citrix.com, jbeulich@suse.com, Roger Pau Monné,
	xen-devel@lists.xensource.com
Subject: Re: [PATCH v5 1/5] pci: add pci_iomap_wc() variants
Message-ID: <20150519224452.GT31666@google.com>
References: <1430415364-19679-1-git-send-email-mcgrof@do-not-panic.com>
In-Reply-To: <1430415364-19679-1-git-send-email-mcgrof@do-not-panic.com>

On Thu, Apr 30, 2015 at 10:36:04AM -0700, Luis R. Rodriguez wrote:
> From: "Luis R. Rodriguez"
>
> This allows drivers to take advantage of write-combining
> when possible. The PCI specification does not allow for us
> to automatically identify a memory region which needs
> write-combining so drivers have to identify these areas
> on their own. There is IORESOURCE_PREFETCH but as clarified
> by Michael and confirmed later by Bjorn, PCI prefetch bit
> merely means bridges can combine writes and prefetch reads.
> Prefetch does not affect ordering rules and does not allow
> writes to be collapsed [0]. WC is stronger, it allows collapsing
> and changes ordering rules. WC can also hurt latency as small
> writes are buffered. Because of all this drivers need to
> know what they are doing, we can't set a write-combining
> preference flag in the pci core automatically for drivers.
>
> Lastly although there is also arch_phys_wc_add() this makes
> use of architecture specific write-combining *hacks* and
> the only one currently defined and used is MTRR for x86.
> MTRRs are legacy, limited in number, have restrictive size
> constraints, and are known to interact poorly with the BIOS.
> MTRRs should only really be considered on old video framebuffer
> drivers. If we made ioremap_wc() and similar calls start
> automatically adding MTRRs, then performance will vary wildly
> with the order of driver loading because we'll run out of MTRRs
> part-way through bootup.
>
> There are a few motivations for phasing out MTRR and
> helping drivers change over to use write-combining with PAT:
>
> a) Take advantage of PAT when available
>
> b) Help bury MTRR code away, MTRR is architecture specific and on
>    x86 it's replaced by PAT
>
> c) Help with the goal of eventually using _PAGE_CACHE_UC over
>    _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit
>    de33c442e titled "x86 PAT: fix performance drop for glx,
>    use UC minus for ioremap(), ioremap_nocache() and
>    pci_mmap_page_range()")
>
> [0] https://lkml.org/lkml/2015/4/21/714
>
> Cc: Toshi Kani
> Cc: Andy Lutomirski
> Cc: Suresh Siddha
> Cc: Ingo Molnar
> Cc: Thomas Gleixner
> Cc: Juergen Gross
> Cc: Daniel Vetter
> Cc: Dave Airlie
> Cc: Bjorn Helgaas
> Cc: Antonino Daplas
> Cc: Jean-Christophe Plagniol-Villard
> Cc: Tomi Valkeinen
> Cc: Dave Hansen
> Cc: Arnd Bergmann
> Cc: Michael S. Tsirkin
> Cc: venkatesh.pallipadi@intel.com
> Cc: Stefan Bader
> Cc: Ville Syrjälä
> Cc: Mel Gorman
> Cc: Vlastimil Babka
> Cc: Borislav Petkov
> Cc: Davidlohr Bueso
> Cc: konrad.wilk@oracle.com
> Cc: ville.syrjala@linux.intel.com
> Cc: david.vrabel@citrix.com
> Cc: jbeulich@suse.com
> Cc: Roger Pau Monné
> Cc: linux-fbdev@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: xen-devel@lists.xensource.com
> Cc: linux-pci@vger.kernel.org
> Signed-off-by: Luis R. Rodriguez

Acked-by: Bjorn Helgaas

> ---
>
> This v5 makes the code return NULL for IORESOURCE_IO and fixes the commit
> log to clarify the conclusions reached for MTRR and our review of
> IORESOURCE_PREFETCH.
>
>  include/asm-generic/pci_iomap.h | 14 ++++++++++
>  lib/pci_iomap.c                 | 61 +++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 75 insertions(+)
>
> diff --git a/include/asm-generic/pci_iomap.h b/include/asm-generic/pci_iomap.h
> index 7389c87..b1e17fc 100644
> --- a/include/asm-generic/pci_iomap.h
> +++ b/include/asm-generic/pci_iomap.h
> @@ -15,9 +15,13 @@ struct pci_dev;
>  #ifdef CONFIG_PCI
>  /* Create a virtual mapping cookie for a PCI BAR (memory or IO) */
>  extern void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long max);
> +extern void __iomem *pci_iomap_wc(struct pci_dev *dev, int bar, unsigned long max);
>  extern void __iomem *pci_iomap_range(struct pci_dev *dev, int bar,
>  				     unsigned long offset,
>  				     unsigned long maxlen);
> +extern void __iomem *pci_iomap_wc_range(struct pci_dev *dev, int bar,
> +					unsigned long offset,
> +					unsigned long maxlen);
>  /* Create a virtual mapping cookie for a port on a given PCI device.
>   * Do not call this directly, it exists to make it easier for architectures
>   * to override */
> @@ -34,12 +38,22 @@ static inline void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned lon
>  	return NULL;
>  }
>
> +static inline void __iomem *pci_iomap_wc(struct pci_dev *dev, int bar, unsigned long max)
> +{
> +	return NULL;
> +}
>  static inline void __iomem *pci_iomap_range(struct pci_dev *dev, int bar,
>  					    unsigned long offset,
>  					    unsigned long maxlen)
>  {
>  	return NULL;
>  }
> +static inline void __iomem *pci_iomap_wc_range(struct pci_dev *dev, int bar,
> +					       unsigned long offset,
> +					       unsigned long maxlen)
> +{
> +	return NULL;
> +}
>  #endif
>
>  #endif /* __ASM_GENERIC_IO_H */
> diff --git a/lib/pci_iomap.c b/lib/pci_iomap.c
> index bcce5f1..9604dcb 100644
> --- a/lib/pci_iomap.c
> +++ b/lib/pci_iomap.c
> @@ -52,6 +52,46 @@ void __iomem *pci_iomap_range(struct pci_dev *dev,
>  EXPORT_SYMBOL(pci_iomap_range);
>
>  /**
> + * pci_iomap_wc_range - create a virtual WC mapping cookie for a PCI BAR
> + * @dev: PCI device that owns the BAR
> + * @bar: BAR number
> + * @offset: map memory at the given offset in BAR
> + * @maxlen: max length of the memory to map
> + *
> + * Using this function you will get a __iomem address to your device BAR.
> + * You can access it using ioread*() and iowrite*(). These functions hide
> + * the details if this is a MMIO or PIO address space and will just do what
> + * you expect from them in the correct way. When possible write combining
> + * is used.
> + *
> + * @maxlen specifies the maximum length to map. If you want to get access to
> + * the complete BAR from offset to the end, pass %0 here.
> + * */
> +void __iomem *pci_iomap_wc_range(struct pci_dev *dev,
> +				 int bar,
> +				 unsigned long offset,
> +				 unsigned long maxlen)
> +{
> +	resource_size_t start = pci_resource_start(dev, bar);
> +	resource_size_t len = pci_resource_len(dev, bar);
> +	unsigned long flags = pci_resource_flags(dev, bar);
> +
> +	if (len <= offset || !start)
> +		return NULL;
> +	len -= offset;
> +	start += offset;
> +	if (maxlen && len > maxlen)
> +		len = maxlen;
> +	if (flags & IORESOURCE_IO)
> +		return NULL;
> +	if (flags & IORESOURCE_MEM)
> +		return ioremap_wc(start, len);
> +	/* What? */
> +	return NULL;
> +}
> +EXPORT_SYMBOL_GPL(pci_iomap_wc_range);
> +
> +/**
>   * pci_iomap - create a virtual mapping cookie for a PCI BAR
>   * @dev: PCI device that owns the BAR
>   * @bar: BAR number
> @@ -70,4 +110,25 @@ void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long maxlen)
>  	return pci_iomap_range(dev, bar, 0, maxlen);
>  }
>  EXPORT_SYMBOL(pci_iomap);
> +
> +/**
> + * pci_iomap_wc - create a virtual WC mapping cookie for a PCI BAR
> + * @dev: PCI device that owns the BAR
> + * @bar: BAR number
> + * @maxlen: length of the memory to map
> + *
> + * Using this function you will get a __iomem address to your device BAR.
> + * You can access it using ioread*() and iowrite*(). These functions hide
> + * the details if this is a MMIO or PIO address space and will just do what
> + * you expect from them in the correct way. When possible write combining
> + * is used.
> + *
> + * @maxlen specifies the maximum length to map. If you want to get access to
> + * the complete BAR without checking for its length first, pass %0 here.
> + * */
> +void __iomem *pci_iomap_wc(struct pci_dev *dev, int bar, unsigned long maxlen)
> +{
> +	return pci_iomap_wc_range(dev, bar, 0, maxlen);
> +}
> +EXPORT_SYMBOL_GPL(pci_iomap_wc);
>  #endif /* CONFIG_PCI */
> --
> 2.3.2.209.gd67f9d5.dirty