2010-11-27 13:37:44

by Thomas Schlichter

[permalink] [raw]
Subject: [PATCH] uvesafb,vesafb: create write-combining or write-back PAT entries

Hi,

with an PAT-enabled kernel (tested with 2.6.35, but the problem is still
present in current -tip), when using uvesafb or vesafb, these drivers will
create uncached-minus PAT entries for the framebuffer memory because they use
ioremap() (not the *_cached or *_wc variants). When the framebuffer memory
intersects with the video RAM used by Xorg, the complete video RAM will be
mapped uncached-minus what results in a serve performance penalty.

Here are the correct MTRR entries created by uvesafb:
schlicht@netbook:~$ cat /proc/mtrr
reg00: base=0x000000000 ( 0MB), size= 2048MB, count=1: write-back
reg01: base=0x06ff00000 ( 1791MB), size= 1MB, count=1: uncachable
reg02: base=0x070000000 ( 1792MB), size= 256MB, count=1: uncachable
reg03: base=0x0d0000000 ( 3328MB), size= 16MB, count=1: write-combining

And here are the problematic PAT entries:
schlicht@netbook:~$ sudo cat /sys/kernel/debug/x86/pat_memtype_list
PAT memtype list:
write-back @ 0x0-0x1000
uncached-minus @ 0x6fedd000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee3000-0x6fee4000
uncached-minus @ 0x6fee3000-0x6fee4000
uncached-minus @ 0x6fee3000-0x6fee4000
uncached-minus @ 0xd0000000-0xe0000000 <-- created by xserver-xorg
uncached-minus @ 0xd0000000-0xd1194000 <-- created by uvesafb
uncached-minus @ 0xf4000000-0xf4009000
uncached-minus @ 0xf4200000-0xf4400000
uncached-minus @ 0xf5000000-0xf5010000
uncached-minus @ 0xf5100000-0xf5104000
uncached-minus @ 0xf5400000-0xf5404000
uncached-minus @ 0xf5404000-0xf5405000
uncached-minus @ 0xf5404000-0xf5405000
uncached-minus @ 0xfed00000-0xfed01000

Therefore I created the attached patch for uvesafb which uses ioremap_wc() to
create the correct PAT entries, as shown below:
schlicht@netbook:~$ sudo cat /sys/kernel/debug/x86/pat_memtype_list
PAT memtype list:
write-back @ 0x0-0x1000
uncached-minus @ 0x6fedd000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee3000-0x6fee4000
uncached-minus @ 0x6fee3000-0x6fee4000
uncached-minus @ 0x6fee3000-0x6fee4000
write-combining @ 0xd0000000-0xe0000000
write-combining @ 0xd0000000-0xd1194000
uncached-minus @ 0xf4000000-0xf4009000
uncached-minus @ 0xf4200000-0xf4400000
uncached-minus @ 0xf5000000-0xf5010000
uncached-minus @ 0xf5100000-0xf5104000
uncached-minus @ 0xf5400000-0xf5404000
uncached-minus @ 0xf5404000-0xf5405000
uncached-minus @ 0xf5404000-0xf5405000
uncached-minus @ 0xfed00000-0xfed01000

This results in a performance gain, objectively measurable with e.g.
x11perf -comppixwin10 -comppixwin100 -comppixwin500:
1: x11perf_xaa.log
2: x11perf_xaa_patched.log

1 2 Operation
-------- ----------------- -----------------
124000.0 202000.0 ( 1.63) Composite 10x10 from pixmap to window
3340.0 24400.0 ( 7.31) Composite 100x100 from pixmap to window
131.0 1150.0 ( 8.78) Composite 500x500 from pixmap to window

You can see the serve performance gain when composing larger pixmaps to
window.
Please consider applying the attached patches for uvesafb and vesafb.

These patches (for -tip) replace the ioremap() function with the variants
matching the mtrr-parameter. To create "write-back" PAT entries, the
ioremap_cache() function must be called after creating the MTRR entries, and
the ioremap_cache() region must completely fit into the MTRR region, this is
why the MTRR region size is now rounded up to the next power-of-two.

Signed-off-by: Thomas Schlichter <[email protected]>

Thank you very much!

Kind regards,
Thomas


Attachments:
vesafb.diff (2.36 kB)
uvesafb.diff (2.54 kB)
Download all attachments

2010-11-30 06:11:50

by Paul Mundt

[permalink] [raw]
Subject: Re: [PATCH] uvesafb,vesafb: create write-combining or write-back PAT entries

On Sat, Nov 27, 2010 at 02:37:37PM +0100, Thomas Schlichter wrote:
> diff --git a/drivers/video/uvesafb.c b/drivers/video/uvesafb.c
> index 52ec095..5a34bc8 100644
> --- a/drivers/video/uvesafb.c
> +++ b/drivers/video/uvesafb.c
...

> - info->screen_base = ioremap(info->fix.smem_start, info->fix.smem_len);
> + uvesafb_init_mtrr(info);
> +
> + switch (mtrr) {
> + case 1: /* uncachable */
> + info->screen_base = ioremap_nocache(info->fix.smem_start, info->fix.smem_len);
> + break;
> + case 2: /* write-back */
> + info->screen_base = ioremap_cache(info->fix.smem_start, info->fix.smem_len);
> + break;
> + case 3: /* write-combining */
> + info->screen_base = ioremap_wc(info->fix.smem_start, info->fix.smem_len);
> + break;
> + case 4: /* write-through */
> + default:
> + info->screen_base = ioremap(info->fix.smem_start, info->fix.smem_len);
> + break;
> + }
>
> if (!info->screen_base) {
> printk(KERN_ERR

uvesafb presently has no special architecture dependencies, but
ioremap_wc() is not as of yet a wholly generic interface. Some
architectures that don't set ARCH_HAS_IOREMAP_WC get it by virtue of the
asm-generic/iomap.h include, and most of the nommu architectures will
stub it in via asm-generic/io.h, but this still leaves quite a long list
of platforms that don't handle it at all.

Your options at this point are either to establish ioremap_wc() as a
generic API, and layer this patch on top of that, or rework
uvesafb_init_mtrr() to do the actual broken-out remapping and rework it
in to something like a uvesafb_remap(), where you bury the MTRR details
under CONFIG_MTRR.

While there's probably value in exposing ioremap_wc() as a generic
interface, you're never going to hit any of the non-default ioremap()
calls on platforms lacking MTRRs anyways, so special-casing it is ok.

2011-02-06 11:15:38

by Thomas Schlichter

[permalink] [raw]
Subject: Re: [PATCH] uvesafb,vesafb: create write-combining or write-back PAT entries

Sorry for answering this late, unfortunately I was quite busy with my daily
work...

Am Dienstag 30 November 2010, um 07:10:51 schrieb Paul Mundt:
> uvesafb presently has no special architecture dependencies, but
> ioremap_wc() is not as of yet a wholly generic interface. Some
> architectures that don't set ARCH_HAS_IOREMAP_WC get it by virtue of the
> asm-generic/iomap.h include, and most of the nommu architectures will
> stub it in via asm-generic/io.h, but this still leaves quite a long list
> of platforms that don't handle it at all.
>
> Your options at this point are either to establish ioremap_wc() as a
> generic API, and layer this patch on top of that, or rework
> uvesafb_init_mtrr() to do the actual broken-out remapping and rework it
> in to something like a uvesafb_remap(), where you bury the MTRR details
> under CONFIG_MTRR.
>
> While there's probably value in exposing ioremap_wc() as a generic
> interface, you're never going to hit any of the non-default ioremap()
> calls on platforms lacking MTRRs anyways, so special-casing it is ok.

Thank you for finding that problem and showing possibilities for fixing it. I
prepared 3 patches, where the first essentially is my old patch with special-
casing via ifdef CONFIG_X86. The second patch establishes ioremap_cache() and
ioremap_wc() for all architectures, and the third patch removed the special-
casing from uvesafb.

So the first patch can stand on its own and hopefully can be merged upstream.
The third patch needs the second one, which may be merged for unifying
purposes. But as these are mor cleanup-patches, they are not that important
for me.

Best regards,
Thomas Schlichter


Attachments:
signature.asc (198.00 B)
This is a digitally signed message part.

2011-02-06 12:34:52

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH 2/3] Add ioremap_cache and ioremap_wc to all architectures

This really should go to linux-arch...

On Sun, Feb 6, 2011 at 12:12, Thomas Schlichter
<[email protected]> wrote:
>
>

Oops, no inline patch. Well, people know they can find the attachment
on lkml ;-)

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds