2015-06-25 01:36:41

by Luis Chamberlain

Subject: [PATCH v5 0/3] atyfb: address MTRR corner case

From: "Luis R. Rodriguez" <[email protected]>

Andrew,

Forgive me for the TL;DR, I'm afraid I need to be crystal clear on this
patchset as it's the most complex in the entire series. The skinny is that this
patchset addresses a complex workaround with APIs now merged upstream going in
for v4.2. The driver maintainer hasn't followed up with the driver changes for
over a month and no one else has provided Acks for these device driver
changes [0]. We have a few options:

0) Sit and wait for a driver maintainer to review this
1) Merge this as-is and hope for reports
2) go with the nopat requirement as with the ivtv and ipath driver

I'd prefer to merge this as is, and only if reports come back with
issues should we then consider 2), as we'd then have at least a well
documented work effort required for this transformation. This device
driver is also old, so I don't expect many reports anyway.

----

The TL;DR:

As part of the long haul effort to rid the world of direct MTRR use [1] we've
had to also work on alternative solutions which can co-exist with PAT
interfaces. Most of the transformation of device drivers to use PAT was fairly
easy (TM): so long as ioremap_wc() was used we could then convert over the
drivers using mtrr_add() over to the arch-agnostic and PAT-aware (ignored when
PAT is enabled) arch_phys_wc_add(). This was typically easy to do, for instance
in cases where a full PCI BAR was used for MMIO registers and another PCI BAR
was used with write-combining effects desirable. In some cases we just needed
new WC APIs for some buses. This was the case for most modern devices, but a
few old devices had a combined set of MMIO registers and the write-combined
area mixed. In such situations even when using MTRR one had to figure out
creative solutions to make things work, especially considering MTRRs were
limited and they had size constraints: an MTRR size must be a power of two
and its base must be aligned to that size.
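
To make the size constraint concrete, here is a minimal user-space sketch of
the check a range has to pass before it can be described by a single MTRR. The
helper name mtrr_range_ok() is hypothetical and not a kernel API, and it
assumes the usual rule for variable-range MTRRs: the size is a power of two
and the base is aligned to that size.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): returns true if the range
 * [base, base + size) could be described by one variable-range MTRR,
 * assuming the size must be a power of two and the base must be
 * aligned to that size.
 */
static bool mtrr_range_ok(uint64_t base, uint64_t size)
{
	/* A power of two has exactly one bit set. */
	if (size == 0 || (size & (size - 1)) != 0)
		return false;

	/* The base must be a multiple of the size. */
	return (base & (size - 1)) == 0;
}
```

Under these assumptions a 16 MiB range at a 16 MiB aligned base passes, while
the same range with the last 4 KiB cut off (0xFFF000 bytes) does not.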

The good news is that on Linux there were only three device drivers in total
that we ended up having radical issues with when converting them over to PAT
interfaces. One was the ivtv media device driver, another was the
infiniband ipath device driver. The other one was the framebuffer atyfb
device driver that this series addresses. For both ivtv and ipath we've
decided to simply require users of those devices to boot with the nopat
kernel parameter because both device drivers are ancient and the work
required to fully convert to PAT interfaces is significant (in the ipath case)
or nearly impossible (ivtv). For details please refer to the respective
and now upstream commits:

7ea402d x86/mm/pat, drivers/infiniband/ipath: Use arch_phys_wc_add() and require PAT disabled
1bf1735 x86/mm/pat, drivers/media/ivtv: Use arch_phys_wc_add() and require PAT disabled

To demo exactly how much effort would have been required I decided to venture
into atyfb and try to fix that device driver first, considering it had the
worst case situation to address as it used size hackery and MTRR combinations
of different types. In order to accomplish this we needed to map out all
possible combinatorial effects of PAT page entries with write-combining, and
page attributes (PAT, PCD, PWT) with write-combining effects for non-PAT
systems. We did this not only for atyfb's sake but also for any other possible
future driver which might meet these same needs. We needed to take this a bit
more seriously given that our long term goal was also to change the default
behaviour of ioremap_nocache() to use strong UC instead of UC-, and this had
to be taken into consideration when converting drivers over. The documentation
table for all these possible combinatorial entries is now upstream:

2f9e897 x86/mm/mtrr, pat: Document Write Combining MTRR type effects on PAT / non-PAT pages

Of importance to this patch set is this table:

----------------------------------------------------------------------
MTRR Non-PAT   PAT    Linux ioremap value        Effective memory type
----------------------------------------------------------------------
                                                  Non-PAT |  PAT
     PAT
     |PCD
     ||PWT
     |||
WC   000      WB      _PAGE_CACHE_MODE_WB            WC   |   WC
WC   001      WC      _PAGE_CACHE_MODE_WC            WC*  |   WC
WC   010      UC-     _PAGE_CACHE_MODE_UC_MINUS      WC*  |   UC
WC   011      UC      _PAGE_CACHE_MODE_UC            UC   |   UC
----------------------------------------------------------------------

In the atyfb case the driver used two MTRR calls, a large WC MTRR followed by a
UC MTRR "hole" call for the MMIO registers. This was done this way on atyfb
because the offset and size of the framebuffer area would only work well
this way, otherwise you'd also have to try a series of small MTRR calls and you
might end up running out of MTRRs. For non-PAT systems we take advantage of the
above map to protect an MMIO region with 011 page attributes (this maps to
strong UC for PAT systems) so that if a large MTRR is issued that encompasses
the MMIO region, the MMIO region remains with an effective UC type, while the
desired WC area would have an effective memory type of WC as its region would
have been mapped with ioremap_wc(). This makes use of the newly introduced
ioremap_uc() as otherwise if we would have used ioremap_nocache() we would
not get the UC effective memory type. This also lets us remove the only UC MTRR
call from the kernel :) After this we'd only have WC MTRR calls in effect
on non-PAT systems on Linux. The more complex patch then is patch 2 which does
most of the magic.
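
As an illustration only, the table above can be written down as a small
user-space lookup. All names below (enum mem_type, struct wc_mtrr_effect,
lookup_wc_mtrr_effect(), mmio_safe_under_wc_mtrr()) are hypothetical and are
not kernel symbols; the values are transcribed row by row from the table,
including the implementation-defined markers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration; these are not kernel symbols. */
enum mem_type { MT_WB, MT_WC, MT_UC_MINUS, MT_UC };

struct wc_mtrr_effect {
	enum mem_type non_pat;      /* effective type on non-PAT systems */
	bool non_pat_impl_defined;  /* the '*' column marker in the table */
	enum mem_type pat;          /* effective type on PAT systems */
};

/*
 * Effect of a WC MTRR, indexed by the PAT|PCD|PWT page attribute bits
 * (000..011). Values are transcribed from the table above.
 */
static const struct wc_mtrr_effect wc_mtrr_table[4] = {
	{ MT_WC, false, MT_WC },  /* 000: _PAGE_CACHE_MODE_WB */
	{ MT_WC, true,  MT_WC },  /* 001: _PAGE_CACHE_MODE_WC */
	{ MT_WC, true,  MT_UC },  /* 010: _PAGE_CACHE_MODE_UC_MINUS */
	{ MT_UC, false, MT_UC },  /* 011: _PAGE_CACHE_MODE_UC */
};

static struct wc_mtrr_effect lookup_wc_mtrr_effect(unsigned int page_attr)
{
	assert(page_attr < 4);
	return wc_mtrr_table[page_attr];
}

/*
 * True if a mapping with these page attributes stays uncached under an
 * overlapping WC MTRR on both non-PAT and PAT systems.
 */
static bool mmio_safe_under_wc_mtrr(unsigned int page_attr)
{
	struct wc_mtrr_effect e = lookup_wc_mtrr_effect(page_attr);

	return e.non_pat == MT_UC && e.pat == MT_UC;
}
```

Only the 011 row satisfies mmio_safe_under_wc_mtrr(), which is why the MMIO
region needs ioremap_uc() rather than the UC- mapping (010) that
ioremap_nocache() provides.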

[0] http://lkml.kernel.org/r/CAB=NE6Xy1UGAqZ8CsVc+JzKqsxREaXBYK+1GjZKN8d2FG8xqJg@mail.gmail.com
[1] http://lkml.kernel.org/r/CAB=NE6UgtdSoBsA=8+ueYRAZHDnWUSmQAoHhAaefqudBrSY7Zw@mail.gmail.com

Luis R. Rodriguez (3):
video: fbdev: atyfb: clarify ioremap() base and length used
video: fbdev: atyfb: replace MTRR UC hole with strong UC
video: fbdev: atyfb: use arch_phys_wc_add() and ioremap_wc()

drivers/video/fbdev/aty/atyfb.h | 5 +--
drivers/video/fbdev/aty/atyfb_base.c | 74 +++++++++++-------------------------
2 files changed, 24 insertions(+), 55 deletions(-)

--
2.3.2.209.gd67f9d5.dirty


2015-06-25 01:38:48

by Luis Chamberlain

Subject: [PATCH v5 1/3] video: fbdev: atyfb: clarify ioremap() base and length used

From: "Luis R. Rodriguez" <[email protected]>

This has no functional changes, it just adjusts
the ioremap() call for the framebuffer to use
the same values we later use for the framebuffer,
this will make it easier to review the next change.

The size of the framebuffer varies but since this is
for PCI we *know* this defaults to 0x800000.
atyfb_setup_generic() is *only* used on PCI probe.

Cc: Toshi Kani <[email protected]>
Cc: Suresh Siddha <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Antonino Daplas <[email protected]>
Cc: Jean-Christophe Plagniol-Villard <[email protected]>
Cc: Tomi Valkeinen <[email protected]>
Cc: Ville Syrjälä <[email protected]>
Cc: Rob Clark <[email protected]>
Cc: Mathias Krause <[email protected]>
Cc: Andrzej Hajda <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Luis R. Rodriguez <[email protected]>
---
drivers/video/fbdev/aty/atyfb_base.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c
index 16936bb..8025624 100644
--- a/drivers/video/fbdev/aty/atyfb_base.c
+++ b/drivers/video/fbdev/aty/atyfb_base.c
@@ -3489,7 +3489,9 @@ static int atyfb_setup_generic(struct pci_dev *pdev, struct fb_info *info,

/* Map in frame buffer */
info->fix.smem_start = addr;
- info->screen_base = ioremap(addr, 0x800000);
+ info->fix.smem_len = 0x800000;
+
+ info->screen_base = ioremap(info->fix.smem_start, info->fix.smem_len);
if (info->screen_base == NULL) {
ret = -ENOMEM;
goto atyfb_setup_generic_fail;
--
2.3.2.209.gd67f9d5.dirty

2015-06-25 01:41:05

by Luis Chamberlain

Subject: [PATCH v5 2/3] video: fbdev: atyfb: replace MTRR UC hole with strong UC

From: "Luis R. Rodriguez" <[email protected]>

Replace a WC MTRR call followed by a UC MTRR "hole" call
with a single WC MTRR call and use strong UC to protect
the MMIO region and account for the device's architecture
and MTRR size requirements.

The atyfb driver relies on two overlapping MTRRs. It
does this to account for the fact that on some devices
it has the MMIO region bundled together with the framebuffer
on the same PCI BAR and the hardware requirement on
MTRRs on both base and size to be powers of two. In the
atyfb driver's worst case the PCI BAR is 16 MiB in size
while the MMIO region is on the last 4 KiB of
the same PCI BAR. If we use just one MTRR for WC we can
only end up with an 8 MiB or 16 MiB framebuffer. Using a
16 MiB WC framebuffer area is unacceptable since we need
the MMIO region to not be write-combined. An 8 MiB WC
framebuffer option does not let us use quite a bit of framebuffer
space, it would reduce the resolution capability of the device
considerably. An alternative is to use many MTRRs but on
some systems that could mean not having enough MTRRs
to cover the framebuffer. The current driver solution is
to issue a 16 MiB WC MTRR followed by a 4 KiB UC MTRR on
the last 4 KiB. It's worth mentioning and documenting
the current ioremap*() strategy as well: the first ioremap()
is used only for the MMIO region, a second ioremap() call
is used for the framebuffer *and* the MMIO region, the MMIO
region then ends up mapped twice. Two ioremap() calls are
used since in some situations the framebuffer actually ends
up on a separate auxiliary PCI BAR, but this is not always
true, in the worst case the PCI BAR is shared for both
MMIO and the framebuffer. By allowing overlapping ioremap()
calls the driver enables two types of devices with one
simple ioremap() strategy.
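
To put a number on the "many MTRRs" alternative, the following user-space
sketch counts how many power-of-two, size-aligned regions a simple greedy
split needs to cover a range exactly. The function name mtrr_chunks_needed()
is hypothetical, and the 4 KiB granularity and the greedy strategy are
assumptions made for illustration, not what the kernel's MTRR code does.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: count the power-of-two, size-aligned regions a
 * greedy split needs to cover [base, base + len) exactly. Assumes both
 * values are multiples of 4 KiB.
 */
static unsigned int mtrr_chunks_needed(uint64_t base, uint64_t len)
{
	unsigned int count = 0;

	assert((base & 0xfff) == 0 && (len & 0xfff) == 0);

	while (len) {
		/* Largest power of two permitted by the alignment of base. */
		uint64_t chunk = base ? (base & (~base + 1))
				      : (UINT64_C(1) << 63);

		/* Shrink it until it fits in what is left to cover. */
		while (chunk > len)
			chunk >>= 1;

		base += chunk;
		len -= chunk;
		count++;
	}

	return count;
}
```

For a 16 MiB aligned BAR, covering everything except the last 4 KiB
(0xFFF000 bytes) takes 12 regions with this split (8 MiB, 4 MiB, and so on
down to 4 KiB), which is more than the handful of variable MTRRs many x86
systems provide, whereas the full 16 MiB takes one.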

For non PAT systems:

As per Intel SDM "11.5.2.1 Selecting Memory Types for Pentium
Pro and Pentium II Processors" [0] the effect of a WC MTRR for
a region with page attribute settings set to PCD=1, PWT=1
(Linux _PAGE_CACHE_MODE_UC) will render the effective memory
type to UC. A WC MTRR for a region with page attribute settings
set to PCD=1, PWT=0 (Linux _PAGE_CACHE_MODE_UC_MINUS) will render
the effective memory type to WC *but* yet this is considered
implementation defined -- that is, "system designers are
encouraged to avoid these implementation-defined combinations".
A WC MTRR for a region with page attribute settings set to
PCD=0, PWT=1 (Linux _PAGE_CACHE_MODE_WC) will render the
effective memory type to WC *but* this is also implementation
defined. Such is the case for non-PAT systems.

For PAT systems:

As per Intel SDM "11.5.2.2 Selecting Memory Types for Pentium
III and More Recent Processor Families" the effect of a WC MTRR
for a region with a PAT entry value of UC will be UC. The effect
of a WC MTRR on a region with a PAT entry UC- will be WC. The
effect of a WC MTRR on a region with PAT entry WC is WC.

This can all be summarized in the following table:

----------------------------------------------------------------------
MTRR Non-PAT   PAT    Linux ioremap value        Effective memory type
----------------------------------------------------------------------
                                                  Non-PAT |  PAT
     PAT
     |PCD
     ||PWT
     |||
WC   000      WB      _PAGE_CACHE_MODE_WB            WC   |   WC
WC   001      WC      _PAGE_CACHE_MODE_WC            WC*  |   WC
WC   010      UC-     _PAGE_CACHE_MODE_UC_MINUS      WC*  |   UC
WC   011      UC      _PAGE_CACHE_MODE_UC            UC   |   UC
----------------------------------------------------------------------

(*) denotes implementation defined

By default Linux today defaults both ioremap() and ioremap_nocache()
to use _PAGE_CACHE_MODE_UC_MINUS. On x86 ioremap() aliases
ioremap_nocache(). The preferred value for Linux may soon
change however, the goal is to use _PAGE_CACHE_MODE_UC by
default in the future.

We can use ioremap_uc() to set PCD=1, PWT=1 on non-PAT systems
and use a PAT value of UC for PAT systems. This will ensure the
same settings are in place regardless of what Linux decides to
use by default later and will not regress our MTRR strategy, since
the effective memory type will differ depending on the value used.
The effect of a WC MTRR on such an area will be nullified. This technique
can be used to protect the MMIO region in this driver's case and
address the restrictions of the device's architecture as well as
restrictions set upon us by powers of 2 when using MTRRs.

This allows us to replace the two MTRR calls with a single
16 MiB WC MTRR and use page-attribute settings for non-PAT
and PAT entry values for PAT systems to ensure the
appropriate effective memory type won't have a write-combined
effect on the MMIO region on both non-PAT and PAT systems.
The framebuffer area will be sure to get the write-combined
effective memory type by white-listing it with ioremap_wc().

We ensure the desired effective memory types are set by:

0) Using one ioremap_uc() for the MMIO region alone.
This will set the page attribute settings for the MMIO
region to PCD=1, PWT=1 for non-PAT systems while using a
strong UC value on PAT systems.

1) Fixing the framebuffer ioremap'd area to exclude the
MMIO region and using ioremap_wc() instead to whitelist
the area we want for write-combining.

In both cases an implementation defined (as per above table)
effective memory type of WC is used for the framebuffer for
non-PAT systems.

[0] https://www-ssl.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf

Cc: Toshi Kani <[email protected]>
Cc: Suresh Siddha <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Antonino Daplas <[email protected]>
Cc: Jean-Christophe Plagniol-Villard <[email protected]>
Cc: Tomi Valkeinen <[email protected]>
Cc: Ville Syrjälä <[email protected]>
Cc: Rob Clark <[email protected]>
Cc: Mathias Krause <[email protected]>
Cc: Andrzej Hajda <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Luis R. Rodriguez <[email protected]>
---
drivers/video/fbdev/aty/atyfb.h | 1 -
drivers/video/fbdev/aty/atyfb_base.c | 36 ++++++++++++++----------------------
2 files changed, 14 insertions(+), 23 deletions(-)

diff --git a/drivers/video/fbdev/aty/atyfb.h b/drivers/video/fbdev/aty/atyfb.h
index 1f39a62..89ec439 100644
--- a/drivers/video/fbdev/aty/atyfb.h
+++ b/drivers/video/fbdev/aty/atyfb.h
@@ -184,7 +184,6 @@ struct atyfb_par {
spinlock_t int_lock;
#ifdef CONFIG_MTRR
int mtrr_aper;
- int mtrr_reg;
#endif
u32 mem_cntl;
struct crtc saved_crtc;
diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c
index 8025624..546f5af 100644
--- a/drivers/video/fbdev/aty/atyfb_base.c
+++ b/drivers/video/fbdev/aty/atyfb_base.c
@@ -2630,21 +2630,13 @@ static int aty_init(struct fb_info *info)

#ifdef CONFIG_MTRR
par->mtrr_aper = -1;
- par->mtrr_reg = -1;
if (!nomtrr) {
- /* Cover the whole resource. */
+ /*
+ * Only the ioremap_wc()'d area will get WC here
+ * since ioremap_uc() was used on the entire PCI BAR.
+ */
par->mtrr_aper = mtrr_add(par->res_start, par->res_size,
MTRR_TYPE_WRCOMB, 1);
- if (par->mtrr_aper >= 0 && !par->aux_start) {
- /* Make a hole for mmio. */
- par->mtrr_reg = mtrr_add(par->res_start + 0x800000 -
- GUI_RESERVE, GUI_RESERVE,
- MTRR_TYPE_UNCACHABLE, 1);
- if (par->mtrr_reg < 0) {
- mtrr_del(par->mtrr_aper, 0, 0);
- par->mtrr_aper = -1;
- }
- }
}
#endif

@@ -2776,10 +2768,6 @@ aty_init_exit:
par->pll_ops->set_pll(info, &par->saved_pll);

#ifdef CONFIG_MTRR
- if (par->mtrr_reg >= 0) {
- mtrr_del(par->mtrr_reg, 0, 0);
- par->mtrr_reg = -1;
- }
if (par->mtrr_aper >= 0) {
mtrr_del(par->mtrr_aper, 0, 0);
par->mtrr_aper = -1;
@@ -3466,7 +3454,11 @@ static int atyfb_setup_generic(struct pci_dev *pdev, struct fb_info *info,
}

info->fix.mmio_start = raddr;
- par->ati_regbase = ioremap(info->fix.mmio_start, 0x1000);
+ /*
+ * By using strong UC we force the MTRR to never have an
+ * effect on the MMIO region on both non-PAT and PAT systems.
+ */
+ par->ati_regbase = ioremap_uc(info->fix.mmio_start, 0x1000);
if (par->ati_regbase == NULL)
return -ENOMEM;

@@ -3491,7 +3483,10 @@ static int atyfb_setup_generic(struct pci_dev *pdev, struct fb_info *info,
info->fix.smem_start = addr;
info->fix.smem_len = 0x800000;

- info->screen_base = ioremap(info->fix.smem_start, info->fix.smem_len);
+ aty_fudge_framebuffer_len(info);
+
+ info->screen_base = ioremap_wc(info->fix.smem_start,
+ info->fix.smem_len);
if (info->screen_base == NULL) {
ret = -ENOMEM;
goto atyfb_setup_generic_fail;
@@ -3563,6 +3558,7 @@ static int atyfb_pci_probe(struct pci_dev *pdev,
return -ENOMEM;
}
par = info->par;
+ par->bus_type = PCI;
info->fix = atyfb_fix;
info->device = &pdev->dev;
par->pci_id = pdev->device;
@@ -3732,10 +3728,6 @@ static void atyfb_remove(struct fb_info *info)
#endif

#ifdef CONFIG_MTRR
- if (par->mtrr_reg >= 0) {
- mtrr_del(par->mtrr_reg, 0, 0);
- par->mtrr_reg = -1;
- }
if (par->mtrr_aper >= 0) {
mtrr_del(par->mtrr_aper, 0, 0);
par->mtrr_aper = -1;
--
2.3.2.209.gd67f9d5.dirty

2015-06-25 01:43:13

by Luis Chamberlain

Subject: [PATCH v5 3/3] video: fbdev: atyfb: use arch_phys_wc_add() and ioremap_wc()

From: "Luis R. Rodriguez" <[email protected]>

This driver uses strong UC for the MMIO region and ioremap_wc()
for the framebuffer, which whitelists the area the WC MTRR is allowed
to turn into WC. On PAT systems we don't need the MTRR call so just use
arch_phys_wc_add() there, this lets us remove all those ifdefs.
Let's also be consistent and use ioremap_wc() for ATARI as well.

There are a few motivations for this:

a) Take advantage of PAT when available

b) Help bury MTRR code away, MTRR is architecture specific and on
x86 it's replaced by PAT

c) Help with the goal of eventually using _PAGE_CACHE_UC over
_PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit
de33c442e titled "x86 PAT: fix performance drop for glx,
use UC minus for ioremap(), ioremap_nocache() and
pci_mmap_page_range()")

The conversion done is expressed by the following Coccinelle
SmPL patch. It additionally required manual intervention to
address all the #ifdeffery and removal of redundant things which
arch_phys_wc_add() already addresses, such as the verbose message
when the MTRR call fails and doing nothing when we didn't get
an MTRR.

@ mtrr_found @
expression index, base, size;
@@

-index = mtrr_add(base, size, MTRR_TYPE_WRCOMB, 1);
+index = arch_phys_wc_add(base, size);

@ mtrr_rm depends on mtrr_found @
expression mtrr_found.index, mtrr_found.base, mtrr_found.size;
@@

-mtrr_del(index, base, size);
+arch_phys_wc_del(index);

@ mtrr_rm_zero_arg depends on mtrr_found @
expression mtrr_found.index;
@@

-mtrr_del(index, 0, 0);
+arch_phys_wc_del(index);

@ mtrr_rm_fb_info depends on mtrr_found @
struct fb_info *info;
expression mtrr_found.index;
@@

-mtrr_del(index, info->fix.smem_start, info->fix.smem_len);
+arch_phys_wc_del(index);

@ ioremap_replace_nocache depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@

-info->screen_base = ioremap_nocache(base, size);
+info->screen_base = ioremap_wc(base, size);

@ ioremap_replace_default depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@

-info->screen_base = ioremap(base, size);
+info->screen_base = ioremap_wc(base, size);

Generated-by: Coccinelle SmPL
Cc: Toshi Kani <[email protected]>
Cc: Suresh Siddha <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Antonino Daplas <[email protected]>
Cc: Jean-Christophe Plagniol-Villard <[email protected]>
Cc: Tomi Valkeinen <[email protected]>
Cc: Ville Syrjälä <[email protected]>
Cc: Rob Clark <[email protected]>
Cc: Mathias Krause <[email protected]>
Cc: Andrzej Hajda <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Luis R. Rodriguez <[email protected]>
---
drivers/video/fbdev/aty/atyfb.h | 4 +---
drivers/video/fbdev/aty/atyfb_base.c | 36 +++++++-----------------------------
2 files changed, 8 insertions(+), 32 deletions(-)

diff --git a/drivers/video/fbdev/aty/atyfb.h b/drivers/video/fbdev/aty/atyfb.h
index 89ec439..63c4842 100644
--- a/drivers/video/fbdev/aty/atyfb.h
+++ b/drivers/video/fbdev/aty/atyfb.h
@@ -182,9 +182,7 @@ struct atyfb_par {
unsigned long irq_flags;
unsigned int irq;
spinlock_t int_lock;
-#ifdef CONFIG_MTRR
- int mtrr_aper;
-#endif
+ int wc_cookie;
u32 mem_cntl;
struct crtc saved_crtc;
union aty_pll saved_pll;
diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c
index 546f5af..96c605c 100644
--- a/drivers/video/fbdev/aty/atyfb_base.c
+++ b/drivers/video/fbdev/aty/atyfb_base.c
@@ -98,9 +98,6 @@
#ifdef CONFIG_PMAC_BACKLIGHT
#include <asm/backlight.h>
#endif
-#ifdef CONFIG_MTRR
-#include <asm/mtrr.h>
-#endif

/*
* Debug flags.
@@ -303,9 +300,7 @@ static struct fb_ops atyfb_ops = {
};

static bool noaccel;
-#ifdef CONFIG_MTRR
static bool nomtrr;
-#endif
static int vram;
static int pll;
static int mclk;
@@ -2628,17 +2623,13 @@ static int aty_init(struct fb_info *info)
aty_st_le32(BUS_CNTL, aty_ld_le32(BUS_CNTL, par) |
BUS_APER_REG_DIS, par);

-#ifdef CONFIG_MTRR
- par->mtrr_aper = -1;
- if (!nomtrr) {
+ if (!nomtrr)
/*
* Only the ioremap_wc()'d area will get WC here
* since ioremap_uc() was used on the entire PCI BAR.
*/
- par->mtrr_aper = mtrr_add(par->res_start, par->res_size,
- MTRR_TYPE_WRCOMB, 1);
- }
-#endif
+ par->wc_cookie = arch_phys_wc_add(par->res_start,
+ par->res_size);

info->fbops = &atyfb_ops;
info->pseudo_palette = par->pseudo_palette;
@@ -2766,13 +2757,8 @@ aty_init_exit:
/* restore video mode */
aty_set_crtc(par, &par->saved_crtc);
par->pll_ops->set_pll(info, &par->saved_pll);
+ arch_phys_wc_del(par->wc_cookie);

-#ifdef CONFIG_MTRR
- if (par->mtrr_aper >= 0) {
- mtrr_del(par->mtrr_aper, 0, 0);
- par->mtrr_aper = -1;
- }
-#endif
return ret;
}

@@ -3660,7 +3646,8 @@ static int __init atyfb_atari_probe(void)
* Map the video memory (physical address given)
* to somewhere in the kernel address space.
*/
- info->screen_base = ioremap(phys_vmembase[m64_num], phys_size[m64_num]);
+ info->screen_base = ioremap_wc(phys_vmembase[m64_num],
+ phys_size[m64_num]);
info->fix.smem_start = (unsigned long)info->screen_base; /* Fake! */
par->ati_regbase = ioremap(phys_guiregbase[m64_num], 0x10000) +
0xFC00ul;
@@ -3726,13 +3713,8 @@ static void atyfb_remove(struct fb_info *info)
if (M64_HAS(MOBIL_BUS))
aty_bl_exit(info->bl_dev);
#endif
+ arch_phys_wc_del(par->wc_cookie);

-#ifdef CONFIG_MTRR
- if (par->mtrr_aper >= 0) {
- mtrr_del(par->mtrr_aper, 0, 0);
- par->mtrr_aper = -1;
- }
-#endif
#ifndef __sparc__
if (par->ati_regbase)
iounmap(par->ati_regbase);
@@ -3848,10 +3830,8 @@ static int __init atyfb_setup(char *options)
while ((this_opt = strsep(&options, ",")) != NULL) {
if (!strncmp(this_opt, "noaccel", 7)) {
noaccel = 1;
-#ifdef CONFIG_MTRR
} else if (!strncmp(this_opt, "nomtrr", 6)) {
nomtrr = 1;
-#endif
} else if (!strncmp(this_opt, "vram:", 5))
vram = simple_strtoul(this_opt + 5, NULL, 0);
else if (!strncmp(this_opt, "pll:", 4))
@@ -4021,7 +4001,5 @@ module_param(comp_sync, int, 0);
MODULE_PARM_DESC(comp_sync, "Set composite sync signal to low (0) or high (1)");
module_param(mode, charp, 0);
MODULE_PARM_DESC(mode, "Specify resolution as \"<xres>x<yres>[-<bpp>][@<refresh>]\" ");
-#ifdef CONFIG_MTRR
module_param(nomtrr, bool, 0);
MODULE_PARM_DESC(nomtrr, "bool: disable use of MTRR registers");
-#endif
--
2.3.2.209.gd67f9d5.dirty

2015-06-25 16:44:07

by Luis Chamberlain

Subject: Re: [PATCH v5 0/3] atyfb: address MTRR corner case

On Wed, Jun 24, 2015 at 06:34:17PM -0700, Luis R. Rodriguez wrote:
> From: "Luis R. Rodriguez" <[email protected]>
>
> Andrew,

Andrew, as Ingo noted please disregard these patches as it seems we'll be
preferring for this to go through the x86 tree.

Luis

2015-06-25 20:48:29

by Borislav Petkov

Subject: Re: [PATCH v5 0/3] atyfb: address MTRR corner case

On Wed, Jun 24, 2015 at 06:34:17PM -0700, Luis R. Rodriguez wrote:
> Luis R. Rodriguez (3):
> video: fbdev: atyfb: clarify ioremap() base and length used
> video: fbdev: atyfb: replace MTRR UC hole with strong UC
> video: fbdev: atyfb: use arch_phys_wc_add() and ioremap_wc()
>
> drivers/video/fbdev/aty/atyfb.h | 5 +--
> drivers/video/fbdev/aty/atyfb_base.c | 74 +++++++++++-------------------------
> 2 files changed, 24 insertions(+), 55 deletions(-)

Took those too along with

"[PATCH] video: fbdev: atyfb: move framebuffer length fudging to helper"

which was missing here.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2015-06-25 23:04:56

by Ville Syrjälä

Subject: Re: [PATCH v5 1/3] video: fbdev: atyfb: clarify ioremap() base and length used

On Wed, Jun 24, 2015 at 06:34:18PM -0700, Luis R. Rodriguez wrote:
> From: "Luis R. Rodriguez" <[email protected]>
>
> This has no functional changes, it just adjusts
> the ioremap() call for the framebuffer to use
> the same values we later use for the framebuffer,
> this will make it easier to review the next change.
>
> The size of the framebuffer varies but since this is
> for PCI we *know* this defaults to 0x800000.
> atyfb_setup_generic() is *only* used on PCI probe.
>
> Cc: Toshi Kani <[email protected]>
> Cc: Suresh Siddha <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Linus Torvalds <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Juergen Gross <[email protected]>
> Cc: Daniel Vetter <[email protected]>
> Cc: Andy Lutomirski <[email protected]>
> Cc: Dave Airlie <[email protected]>
> Cc: Antonino Daplas <[email protected]>
> Cc: Jean-Christophe Plagniol-Villard <[email protected]>
> Cc: Tomi Valkeinen <[email protected]>
> Cc: Ville Syrjälä <[email protected]>
> Cc: Rob Clark <[email protected]>
> Cc: Mathias Krause <[email protected]>
> Cc: Andrzej Hajda <[email protected]>
> Cc: Mel Gorman <[email protected]>
> Cc: Vlastimil Babka <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Davidlohr Bueso <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Luis R. Rodriguez <[email protected]>
> ---
> drivers/video/fbdev/aty/atyfb_base.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c
> index 16936bb..8025624 100644
> --- a/drivers/video/fbdev/aty/atyfb_base.c
> +++ b/drivers/video/fbdev/aty/atyfb_base.c
> @@ -3489,7 +3489,9 @@ static int atyfb_setup_generic(struct pci_dev *pdev, struct fb_info *info,
>
> /* Map in frame buffer */
> info->fix.smem_start = addr;
> - info->screen_base = ioremap(addr, 0x800000);
> + info->fix.smem_len = 0x800000;
> +
> + info->screen_base = ioremap(info->fix.smem_start, info->fix.smem_len);

The framebuffer size isn't always 8MB. That's the size of the BAR. So
this change isn't really correct. I suppose it doesn't hurt too much
since smem_len gets overwritten later in aty_init().

> if (info->screen_base == NULL) {
> ret = -ENOMEM;
> goto atyfb_setup_generic_fail;
> --
> 2.3.2.209.gd67f9d5.dirty
>

--
Ville Syrjälä
[email protected]
http://www.sci.fi/~syrjala/

2015-06-25 23:07:12

by Luis Chamberlain

Subject: Re: [PATCH v5 1/3] video: fbdev: atyfb: clarify ioremap() base and length used

On Thu, Jun 25, 2015 at 4:04 PM, Ville Syrjälä <[email protected]> wrote:
> it doesn't hurt too much
> since smem_len gets overwritten later in aty_init().

That's the idea, we set it with a default as it will be overwritten
later anyway.

Luis

2015-06-25 23:17:18

by Ville Syrjälä

Subject: Re: [PATCH v5 1/3] video: fbdev: atyfb: clarify ioremap() base and length used

On Thu, Jun 25, 2015 at 04:06:45PM -0700, Luis R. Rodriguez wrote:
> On Thu, Jun 25, 2015 at 4:04 PM, Ville Syrjälä <[email protected]> wrote:
> > it doesn't hurt too much
> > since smem_len gets overwritten later in aty_init().
>
> That's the idea, we set it with a default as it will be overwritten
> later anyway.

Maybe toss in a comment? Otherwise it's a bit dishonest and might give
someone the impression that all PCI cards really have 8MB of memory.

--
Ville Syrjälä
[email protected]
http://www.sci.fi/~syrjala/

2015-06-26 01:09:42

by Luis Chamberlain

Subject: Re: [PATCH v5 1/3] video: fbdev: atyfb: clarify ioremap() base and length used

On Fri, Jun 26, 2015 at 02:11:03AM +0300, Ville Syrjälä wrote:
> On Thu, Jun 25, 2015 at 04:06:45PM -0700, Luis R. Rodriguez wrote:
> > On Thu, Jun 25, 2015 at 4:04 PM, Ville Syrjälä <[email protected]> wrote:
> > > it doesn't hurt too much
> > > since smem_len gets overwritten later in aty_init().
> >
> > That's the idea, we set it with a default as it will be overwritten
> > later anyway.
>
> Maybe toss in a comment? Otherwise it's a bit dishonest and might give
> someone the impression that all PCI cards really have 8MB of memory.

Sure, mind this as a follow up patch if it's too late?

Luis

2015-06-26 07:30:55

by Borislav Petkov

Subject: Re: [PATCH v5 1/3] video: fbdev: atyfb: clarify ioremap() base and length used

On Fri, Jun 26, 2015 at 03:09:27AM +0200, Luis R. Rodriguez wrote:
> Sure, mind this as a follow up patch if its too late?

No need, you can send me an updated one - I'll replace it.

Thanks.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2015-07-02 23:23:54

by Luis Chamberlain

Subject: Re: [PATCH v5 1/3] video: fbdev: atyfb: clarify ioremap() base and length used

On Fri, Jun 26, 2015 at 12:30 AM, Borislav Petkov <[email protected]> wrote:
> On Fri, Jun 26, 2015 at 03:09:27AM +0200, Luis R. Rodriguez wrote:
>> Sure, mind this as a follow up patch if its too late?
>
> No need, you can send me an updated one - I'll replace it.

Will do!

Luis

2015-07-08 00:25:26

by Luis Chamberlain

Subject: Re: [PATCH v5 1/3] video: fbdev: atyfb: clarify ioremap() base and length used

On Thu, Jul 2, 2015 at 4:23 PM, Luis R. Rodriguez <[email protected]> wrote:
> On Fri, Jun 26, 2015 at 12:30 AM, Borislav Petkov <[email protected]> wrote:
>> On Fri, Jun 26, 2015 at 03:09:27AM +0200, Luis R. Rodriguez wrote:
>>> Sure, mind this as a follow up patch if its too late?
>>
>> No need, you can send me an updated one - I'll replace it.
>
> Will do!

OK the comment I'm adding:

@@ -3489,6 +3489,15 @@ static int atyfb_setup_generic(struct pci_dev
*pdev, struct fb_info *info,

/* Map in frame buffer */
info->fix.smem_start = addr;
+
+ /*
+ * The framebuffer is not always 8 MiB that's just the size of the
+ * PCI BAR, this is later corrected for use with write-combining
+ * helpers with aty_fudge_framebuffer_len() which will adjust the
+ * framebuffer accordingly depending on the device. We do this
+ * to match semantics over ioremap calls on framebuffer devices
+ * with with other drivers with the info->fix.smem_len.
+ */
info->fix.smem_len = 0x800000;

info->screen_base = ioremap(info->fix.smem_start, info->fix.smem_len);

Will respin.

Luis

2015-07-08 08:39:15

by Ville Syrjälä

Subject: Re: [PATCH v5 1/3] video: fbdev: atyfb: clarify ioremap() base and length used

On Tue, Jul 07, 2015 at 05:24:57PM -0700, Luis R. Rodriguez wrote:
> On Thu, Jul 2, 2015 at 4:23 PM, Luis R. Rodriguez <[email protected]> wrote:
> > On Fri, Jun 26, 2015 at 12:30 AM, Borislav Petkov <[email protected]> wrote:
> >> On Fri, Jun 26, 2015 at 03:09:27AM +0200, Luis R. Rodriguez wrote:
> >>> Sure, mind this as a follow up patch if its too late?
> >>
> >> No need, you can send me an updated one - I'll replace it.
> >
> > Will do!
>
> OK the commend I'm adding:
>
> @@ -3489,6 +3489,15 @@ static int atyfb_setup_generic(struct pci_dev
> *pdev, struct fb_info *info,
>
> /* Map in frame buffer */
> info->fix.smem_start = addr;
> +
> + /*
> + * The framebuffer is not always 8 MiB that's just the size of the
> + * PCI BAR, this is later corrected for use with write-combining
> + * helpers with aty_fudge_framebuffer_len() which will adjust the
> + * framebuffer accordingly depending on the device.

That somehow gives me the impression that aty_fudge_framebuffer_len()
changes smem_len to match the framebuffer size, which it does
not.

Dunno, maybe something like this?
/*
* The framebuffer is not always 8 MiB that's just the size of the
* PCI BAR. We temporarily abuse smem_len here to store the size
* of the BAR. aty_init() will later correct it to match the actual
* framebuffer size.
*
* On devices that don't have the auxiliary register aperture, the
* registers are housed at the top end of the framebuffer PCI BAR.
* aty_fudge_framebuffer_len() is used to reduce smem_len to not
* overlap with the registers.
*/

> We do this
> + * to match semantics over ioremap calls on framebuffer devices
> + * with with other drivers with the info->fix.smem_len.
> + */
> info->fix.smem_len = 0x800000;
>
> info->screen_base = ioremap(info->fix.smem_start, info->fix.smem_len);
>
> Will respin.
>
> Luis

--
Ville Syrjälä
[email protected]
http://www.sci.fi/~syrjala/

2015-07-09 17:25:19

by Luis Chamberlain

Subject: Re: [PATCH v5 1/3] video: fbdev: atyfb: clarify ioremap() base and length used

On Wed, Jul 08, 2015 at 11:38:49AM +0300, Ville Syrjälä wrote:
> On Tue, Jul 07, 2015 at 05:24:57PM -0700, Luis R. Rodriguez wrote:
> > On Thu, Jul 2, 2015 at 4:23 PM, Luis R. Rodriguez <[email protected]> wrote:
> > > On Fri, Jun 26, 2015 at 12:30 AM, Borislav Petkov <[email protected]> wrote:
> > >> On Fri, Jun 26, 2015 at 03:09:27AM +0200, Luis R. Rodriguez wrote:
> > >>> Sure, mind this as a follow up patch if its too late?
> > >>
> > >> No need, you can send me an updated one - I'll replace it.
> > >
> > > Will do!
> >
> > OK the commend I'm adding:
> >
> > @@ -3489,6 +3489,15 @@ static int atyfb_setup_generic(struct pci_dev
> > *pdev, struct fb_info *info,
> >
> > /* Map in frame buffer */
> > info->fix.smem_start = addr;
> > +
> > + /*
> > + * The framebuffer is not always 8 MiB that's just the size of the
> > + * PCI BAR, this is later corrected for use with write-combining
> > + * helpers with aty_fudge_framebuffer_len() which will adjust the
> > + * framebuffer accordingly depending on the device.
>
> That somehow gives me the impression that aty_fudge_framebuffer_len()
> changes smem_len to match the framebuffer size, which it does
> not.
>
> Dunno, maybe something like this?
> /*
> * The framebuffer is not always 8 MiB that's just the size of the
> * PCI BAR. We temporarily abuse smem_len here to store the size
> * of the BAR. aty_init() will later correct it to match the actual
> * framebuffer size.
> *
> * On devices that don't have the auxiliary register aperture, the
> * registers are housed at the top end of the framebuffer PCI BAR.
> * aty_fudge_framebuffer_len() is used to reduce smem_len to not
> * overlap with the registers.
> */

Thanks Ville, I used that. Will send out a v6 series.

Luis