2018-11-01 15:29:08

by Jean Delvare

[permalink] [raw]
Subject: Performance regression in ast drm driver

Hi David,

The following commit:

commit 7cf321d118a825c1541b43ca45294126fd474efa
Author: Dave Airlie
Date: Mon Oct 24 15:37:48 2016 +1000

drm/drivers: add support for using the arch wc mapping API.

is causing a huge performance regression for the ast drm driver. In a
text console, if I call "cat" on a large text file, it takes almost
twice as much time to be displayed and scrolled completely.

Can you please check that the ast driver portion of that commit is both
correct and complete?

Thanks,
--
Jean Delvare
SUSE L3 Support


2018-11-08 12:06:53

by Jean Delvare

[permalink] [raw]
Subject: Re: Performance regression in ast drm driver

On Thu, 1 Nov 2018 16:27:07 +0100, Jean Delvare wrote:
> Hi David,
>
> The following commit:
>
> commit 7cf321d118a825c1541b43ca45294126fd474efa
> Author: Dave Airlie
> Date: Mon Oct 24 15:37:48 2016 +1000
>
> drm/drivers: add support for using the arch wc mapping API.
>
> is causing a huge performance regression for the ast drm driver. In a
> text console, if I call "cat" on a large text file, it takes almost
> twice as much time to be displayed and scrolled completely.
>
> Can you please check that the ast driver portion of that commit is both
> correct and complete?

And in the meantime, what bad will happen if we just revert the ast
portion of that commit?

Thanks,
--
Jean Delvare
SUSE L3 Support

2018-11-09 00:06:10

by David Airlie

[permalink] [raw]
Subject: Re: Performance regression in ast drm driver

On Thu, Nov 8, 2018 at 10:05 PM Jean Delvare <[email protected]> wrote:
>
> On Thu, 1 Nov 2018 16:27:07 +0100, Jean Delvare wrote:
> > Hi David,
> >
> > The following commit:
> >
> > commit 7cf321d118a825c1541b43ca45294126fd474efa
> > Author: Dave Airlie
> > Date: Mon Oct 24 15:37:48 2016 +1000
> >
> > drm/drivers: add support for using the arch wc mapping API.
> >
> > is causing a huge performance regression for the ast drm driver. In a
> > text console, if I call "cat" on a large text file, it takes almost
> > twice as much time to be displayed and scrolled completely.
> >
> > Can you please check that the ast driver portion of that commit is both
> > correct and complete?
>
> And in the meantime, what bad will happen if we just revert the ast
> portion of that commit?
>

This seems likely to be a hw problem with PCI writes to the AST "GPU",
since it's just some sort of RAM + ARM on the end of a PCIE bus, we've
definitely seen possible issues in the past with write combining
around some of the mga GPUs with some CPUs.

Have we seen the problem across a number of AST devices?

Dave.

2018-11-10 08:39:58

by Jean Delvare

[permalink] [raw]
Subject: Re: Performance regression in ast drm driver

On Fri, 9 Nov 2018 10:04:03 +1000, David Airlie wrote:
> This seems likely to be a hw problem with PCI writes to the AST "GPU",
> since it's just some sort of RAM + ARM on the end of a PCIE bus, we've
> definitely seen possible issues in the past with write combining
> around some of the mga GPUs with some CPUs.
>
> Have we seen the problem across a number of AST devices?

The reports I received from customers were all on AST 2500 devices (on
Supermicro X11DPi-N, Supermicro X11DPH-T and Asus WS C621 Sage).

I was able to reproduce "the problem" on my old Asus Z6NA-D6 which has
either an AST 2050 device if I trust the board specifications on
asus.com, or an AST 1100 if I trust /var/log/Xorg.0.log.

Note however that I am still not certain that the problem I am seeing
is the same as what both customers reported. It is possible that we
have 2 different issues.

--
Jean Delvare
SUSE L3 Support

2018-11-12 14:37:25

by Jean Delvare

[permalink] [raw]
Subject: Re: Performance regression in ast drm driver

Hi David,

On Fri, 2018-11-09 at 10:04 +1000, David Airlie wrote:
> On Thu, Nov 8, 2018 at 10:05 PM Jean Delvare <[email protected]> wrote:
> >
> > On Thu, 1 Nov 2018 16:27:07 +0100, Jean Delvare wrote:
> > > Hi David,
> > >
> > > The following commit:
> > >
> > > commit 7cf321d118a825c1541b43ca45294126fd474efa
> > > Author: Dave Airlie
> > > Date: Mon Oct 24 15:37:48 2016 +1000
> > >
> > > drm/drivers: add support for using the arch wc mapping API.
> > >
> > > is causing a huge performance regression for the ast drm driver. In a
> > > text console, if I call "cat" on a large text file, it takes almost
> > > twice as much time to be displayed and scrolled completely.
> > >
> > > Can you please check that the ast driver portion of that commit is both
> > > correct and complete?
> >
> > And in the meantime, what bad will happen if we just revert the ast
> > portion of that commit?
>
> This seems likely to be a hw problem with PCI writes to the AST "GPU",
> since it's just some sort of RAM + ARM on the end of a PCIE bus, we've
> definitely seen possible issues in the past with write combining
> around some of the mga GPUs with some CPUs.

Takashi asked me to compare the contents of
/sys/kernel/debug/x86/pat_memtype_list before and after the commit
above. Before the commit, we have:

uncached-minus @ 0xfafe0000-0xfb000000
uncached-minus @ 0xfb000000-0xfb005000
write-combining @ 0xfb005000-0xfb584000

After the commit, we have:

uncached-minus @ 0xfafe0000-0xfb000000
uncached-minus @ 0xfb000000-0xfb005000
uncached-minus @ 0xfb000000-0xfb800000
uncached-minus @ 0xfb005000-0xfb584000

The corresponding lines in /proc/iomem are:

f0000000-fed8ffff : PCI Bus 0000:00
  faf00000-fb7fffff : PCI Bus 0000:01
    fafe0000-faffffff : 0000:01:01.0
    fb000000-fb7fffff : 0000:01:01.0

Does it help? Is the change of type expected? Is it not a problem that
one of the ranges is overlapping with 2 other ranges?

--
Jean Delvare
SUSE L3 Support

2018-11-12 14:46:10

by Takashi Iwai

[permalink] [raw]
Subject: Re: Performance regression in ast drm driver

On Mon, 12 Nov 2018 15:36:07 +0100,
Jean Delvare wrote:
>
> Hi David,
>
> On Fri, 2018-11-09 at 10:04 +1000, David Airlie wrote:
> > On Thu, Nov 8, 2018 at 10:05 PM Jean Delvare <[email protected]> wrote:
> > >
> > > On Thu, 1 Nov 2018 16:27:07 +0100, Jean Delvare wrote:
> > > > Hi David,
> > > >
> > > > The following commit:
> > > >
> > > > commit 7cf321d118a825c1541b43ca45294126fd474efa
> > > > Author: Dave Airlie
> > > > Date: Mon Oct 24 15:37:48 2016 +1000
> > > >
> > > > drm/drivers: add support for using the arch wc mapping API.
> > > >
> > > > is causing a huge performance regression for the ast drm driver. In a
> > > > text console, if I call "cat" on a large text file, it takes almost
> > > > twice as much time to be displayed and scrolled completely.
> > > >
> > > > Can you please check that the ast driver portion of that commit is both
> > > > correct and complete?
> > >
> > > And in the meantime, what bad will happen if we just revert the ast
> > > portion of that commit?
> >
> > This seems likely to be a hw problem with PCI writes to the AST "GPU",
> > since it's just some sort of RAM + ARM on the end of a PCIE bus, we've
> > definitely seen possible issues in the past with write combining
> > around some of the mga GPUs with some CPUs.
>
> Takashi asked me to compare the contents of
> /sys/kernel/debug/x86/pat_memtype_list before and after the commit
> above. Before the commit, we have:
>
> uncached-minus @ 0xfafe0000-0xfb000000
> uncached-minus @ 0xfb000000-0xfb005000
> write-combining @ 0xfb005000-0xfb584000
>
> After the commit, we have:
>
> uncached-minus @ 0xfafe0000-0xfb000000
> uncached-minus @ 0xfb000000-0xfb005000
> uncached-minus @ 0xfb000000-0xfb800000
> uncached-minus @ 0xfb005000-0xfb584000

Just to be sure:
could you double-check whether you're checking the right order
(i.e. not checking against the revert)? The change above looks
illogical from what I can see from the commit...


thanks,

Takashi

>
> The corresponding lines in /proc/iomem are:
>
> f0000000-fed8ffff : PCI Bus 0000:00
>   faf00000-fb7fffff : PCI Bus 0000:01
>     fafe0000-faffffff : 0000:01:01.0
>     fb000000-fb7fffff : 0000:01:01.0
>
> Does it help? Is the change of type expected? Is it not a problem that
> one of the ranges is overlapping with 2 other ranges?
>
> --
> Jean Delvare
> SUSE L3 Support
>

2018-11-12 16:44:02

by Jean Delvare

[permalink] [raw]
Subject: Re: Performance regression in ast drm driver

On Mon, 2018-11-12 at 15:45 +0100, Takashi Iwai wrote:
> On Mon, 12 Nov 2018 15:36:07 +0100,
> Jean Delvare wrote:
> > Takashi asked me to compare the contents of
> > /sys/kernel/debug/x86/pat_memtype_list before and after the commit
> > above. Before the commit, we have:
> >
> > uncached-minus @ 0xfafe0000-0xfb000000
> > uncached-minus @ 0xfb000000-0xfb005000
> > write-combining @ 0xfb005000-0xfb584000
> >
> > After the commit, we have:
> >
> > uncached-minus @ 0xfafe0000-0xfb000000
> > uncached-minus @ 0xfb000000-0xfb005000
> > uncached-minus @ 0xfb000000-0xfb800000
> > uncached-minus @ 0xfb005000-0xfb584000
>
> Just to be sure:
> could you double-check whether you're checking the right order
> (i.e. not checking against the revert)? The change above looks
> illogical from what I can see from the commit...

Yes, I double checked and can only confirm what I wrote above. However
while checking I noticed another strange thing: the contents of
pat_memtype_list depend on whether I'm reading the file from an xterm
or from the text console (Ctrl+Alt+F2). So I summarize again:

Commit "drm/drivers: add support for using the arch wc mapping API"
APPLIED, reading pat_memtype_list from the text console:

uncached-minus @ 0xfafe0000-0xfb000000
uncached-minus @ 0xfb000000-0xfb005000
uncached-minus @ 0xfb000000-0xfb800000
uncached-minus @ 0xfb005000-0xfb584000

Commit "drm/drivers: add support for using the arch wc mapping API"
APPLIED, reading pat_memtype_list from an xterm:

uncached-minus @ 0xfafe0000-0xfb000000
uncached-minus @ 0xfb000000-0xfb005000
uncached-minus @ 0xfb000000-0xfb800000

Commit "drm/drivers: add support for using the arch wc mapping API"
REVERTED, reading pat_memtype_list from the text console:

uncached-minus @ 0xfafe0000-0xfb000000
uncached-minus @ 0xfb000000-0xfb005000
write-combining @ 0xfb005000-0xfb584000

Commit "drm/drivers: add support for using the arch wc mapping API"
REVERTED, reading pat_memtype_list from an xterm:

uncached-minus @ 0xfafe0000-0xfb000000
uncached-minus @ 0xfb000000-0xfb005000

If you don't find it logical, remember that if things were the way they
are supposed to be, customers and myself would not be reporting this
bug ;-)

--
Jean Delvare
SUSE L3 Support

2018-11-12 19:37:58

by Thomas Zimmermann

[permalink] [raw]
Subject: Re: Performance regression in ast drm driver

Hi Jean

Am 12.11.18 um 15:36 schrieb Jean Delvare:
> Hi David,
>
> On Fri, 2018-11-09 at 10:04 +1000, David Airlie wrote:
>> On Thu, Nov 8, 2018 at 10:05 PM Jean Delvare <[email protected]> wrote:
>>>
>>> On Thu, 1 Nov 2018 16:27:07 +0100, Jean Delvare wrote:
>>>> Hi David,
>>>>
>>>> The following commit:
>>>>
>>>> commit 7cf321d118a825c1541b43ca45294126fd474efa
>>>> Author: Dave Airlie
>>>> Date: Mon Oct 24 15:37:48 2016 +1000
>>>>
>>>> drm/drivers: add support for using the arch wc mapping API.
>>>>
>>>> is causing a huge performance regression for the ast drm driver. In a
>>>> text console, if I call "cat" on a large text file, it takes almost
>>>> twice as much time to be displayed and scrolled completely.
>>>>
>>>> Can you please check that the ast driver portion of that commit is both
>>>> correct and complete?
>>>
>>> And in the meantime, what bad will happen if we just revert the ast
>>> portion of that commit?
>>
>> This seems likely to be a hw problem with PCI writes to the AST "GPU",
>> since it's just some sort of RAM + ARM on the end of a PCIE bus, we've
>> definitely seen possible issues in the past with write combining
>> around some of the mga GPUs with some CPUs.
>
> Takashi asked me to compare the contents of
> /sys/kernel/debug/x86/pat_memtype_list before and after the commit
> above. Before the commit, we have:
>
> uncached-minus @ 0xfafe0000-0xfb000000
> uncached-minus @ 0xfb000000-0xfb005000
> write-combining @ 0xfb005000-0xfb584000
>
> After the commit, we have:
>
> uncached-minus @ 0xfafe0000-0xfb000000
> uncached-minus @ 0xfb000000-0xfb005000
> uncached-minus @ 0xfb000000-0xfb800000
> uncached-minus @ 0xfb005000-0xfb584000
>
> The corresponding lines in /proc/iomem are:
>
> f0000000-fed8ffff : PCI Bus 0000:00
>   faf00000-fb7fffff : PCI Bus 0000:01
>     fafe0000-faffffff : 0000:01:01.0
>     fb000000-fb7fffff : 0000:01:01.0
>
> Does it help? Is the change of type expected? Is it not a problem that
> one of the ranges is overlapping with 2 other ranges?
>

I debugged this problem with an AST 2000. What happens is that the
vesafb driver attaches first to the graphics device. It disables
write-combining and falls back to uncached access; unless the kernel has
been booted with video=vesafb:mtrr:3. Then it will set-up WC correctly.
The respective code is at [1].

Best regards
Thomas

[1]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/video/fbdev/vesafb.c?h=v4.20-rc2#n415

--
Thomas Zimmermann
Graphics Driver Developer
SUSE Linux GmbH, Maxfeldstr. 5, D-90409 Nürnberg
Tel: +49-911-74053-0; Fax: +49-911-7417755; https://www.suse.com/
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard,
Graham Norton, HRB 21284 (AG Nürnberg)


Attachments:
signature.asc (499.00 B)
OpenPGP digital signature

2018-11-13 08:39:39

by YC Chen

[permalink] [raw]
Subject: RE: Performance regression in ast drm driver

Sir,
We found the performance will be bad on desktop environment with OpenSUSE15 UEFI installation if remove "arch_io_reserve_memtype_wc".

Regards,

Y.C. Chen

-----Original Message-----
From: dri-devel [mailto:[email protected]] On Behalf Of Jean Delvare
Sent: Thursday, November 08, 2018 8:05 PM
To: Dave Airlie <[email protected]>
Cc: Christian König <[email protected]>; [email protected]; [email protected]
Subject: Re: Performance regression in ast drm driver

On Thu, 1 Nov 2018 16:27:07 +0100, Jean Delvare wrote:
> Hi David,
>
> The following commit:
>
> commit 7cf321d118a825c1541b43ca45294126fd474efa
> Author: Dave Airlie
> Date: Mon Oct 24 15:37:48 2016 +1000
>
> drm/drivers: add support for using the arch wc mapping API.
>
> is causing a huge performance regression for the ast drm driver. In a
> text console, if I call "cat" on a large text file, it takes almost
> twice as much time to be displayed and scrolled completely.
>
> Can you please check that the ast driver portion of that commit is
> both correct and complete?

And in the meantime, what bad will happen if we just revert the ast portion of that commit?

Thanks,
--
Jean Delvare
SUSE L3 Support
_______________________________________________
dri-devel mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/dri-devel

2018-11-13 09:24:28

by Thomas Zimmermann

[permalink] [raw]
Subject: Re: Performance regression in ast drm driver

Jean,

ast doesn't remove the vesafb's framebuffer before attaching to the
device. I have a patch at [1]. If you have a way of testing it, I'd
appreciate.

Best regards
Thomas

[1] https://bugzilla.suse.com/show_bug.cgi?id=1112963

Am 12.11.18 um 17:41 schrieb Jean Delvare:
> On Mon, 2018-11-12 at 15:45 +0100, Takashi Iwai wrote:
>> On Mon, 12 Nov 2018 15:36:07 +0100,
>> Jean Delvare wrote:
>>> Takashi asked me to compare the contents of
>>> /sys/kernel/debug/x86/pat_memtype_list before and after the commit
>>> above. Before the commit, we have:
>>>
>>> uncached-minus @ 0xfafe0000-0xfb000000
>>> uncached-minus @ 0xfb000000-0xfb005000
>>> write-combining @ 0xfb005000-0xfb584000
>>>
>>> After the commit, we have:
>>>
>>> uncached-minus @ 0xfafe0000-0xfb000000
>>> uncached-minus @ 0xfb000000-0xfb005000
>>> uncached-minus @ 0xfb000000-0xfb800000
>>> uncached-minus @ 0xfb005000-0xfb584000
>>
>> Just to be sure:
>> could you double-check whether you're checking the right order
>> (i.e. not checking against the revert)? The change above looks
>> illogical from what I can see from the commit...
>
> Yes, I double checked and can only confirm what I wrote above. However
> while checking I noticed another strange thing: the contents of
> pat_memtype_list depend on whether I'm reading the file from an xterm
> or from the text console (Ctrl+Alt+F2). So I summarize again:
>
> Commit "drm/drivers: add support for using the arch wc mapping API"
> APPLIED, reading pat_memtype_list from the text console:
>
> uncached-minus @ 0xfafe0000-0xfb000000
> uncached-minus @ 0xfb000000-0xfb005000
> uncached-minus @ 0xfb000000-0xfb800000
> uncached-minus @ 0xfb005000-0xfb584000
>
> Commit "drm/drivers: add support for using the arch wc mapping API"
> APPLIED, reading pat_memtype_list from an xterm:
>
> uncached-minus @ 0xfafe0000-0xfb000000
> uncached-minus @ 0xfb000000-0xfb005000
> uncached-minus @ 0xfb000000-0xfb800000
>
> Commit "drm/drivers: add support for using the arch wc mapping API"
> REVERTED, reading pat_memtype_list from the text console:
>
> uncached-minus @ 0xfafe0000-0xfb000000
> uncached-minus @ 0xfb000000-0xfb005000
> write-combining @ 0xfb005000-0xfb584000
>
> Commit "drm/drivers: add support for using the arch wc mapping API"
> REVERTED, reading pat_memtype_list from an xterm:
>
> uncached-minus @ 0xfafe0000-0xfb000000
> uncached-minus @ 0xfb000000-0xfb005000
>
> If you don't find it logical, remember that if things were the way they
> are supposed to be, customers and myself would not be reporting this
> bug ;-)
>

--
Thomas Zimmermann
Graphics Driver Developer
SUSE Linux GmbH, Maxfeldstr. 5, D-90409 Nürnberg
Tel: +49-911-74053-0; Fax: +49-911-7417755; https://www.suse.com/
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard,
Graham Norton, HRB 21284 (AG Nürnberg)


Attachments:
signature.asc (499.00 B)
OpenPGP digital signature

2018-11-13 12:10:58

by Jean Delvare

[permalink] [raw]
Subject: Re: Performance regression in ast drm driver

Hi Thomas,

On Tue, 13 Nov 2018 10:23:45 +0100, Thomas Zimmermann wrote:
> ast doesn't remove the vesafb's framebuffer before attaching to the
> device. I have a patch at [1]. If you have a way of testing it, I'd
> appreciate.
>
> [1] https://bugzilla.suse.com/show_bug.cgi?id=1112963

Thank you very much for looking into this bug. I tested the patch above
and yes, it solves my console performance problem. The performance with
your patch applied is roughly the same as with commit "drm/drivers: add
support for using the arch wc mapping API" reverted.

--
Jean Delvare
SUSE L3 Support

2018-11-13 12:17:57

by Thomas Zimmermann

[permalink] [raw]
Subject: Re: Performance regression in ast drm driver

Hi

Am 13.11.18 um 13:08 schrieb Jean Delvare:
> Hi Thomas,
>
> On Tue, 13 Nov 2018 10:23:45 +0100, Thomas Zimmermann wrote:
>> ast doesn't remove the vesafb's framebuffer before attaching to the
>> device. I have a patch at [1]. If you have a way of testing it, I'd
>> appreciate.
>>
>> [1] https://bugzilla.suse.com/show_bug.cgi?id=1112963
>
> Thank you very much for looking into this bug. I tested the patch above
> and yes, it solves my console performance problem. The performance with
> your patch applied is roughly the same as with commit "drm/drivers: add
> support for using the arch wc mapping API" reverted.
>

Great. I'll add it to SLE and also send it upstream soonish. Let's see
what Edward reports. Thanks for building a kernel for him.

Best regards
Thomas

--
Thomas Zimmermann
Graphics Driver Developer
SUSE Linux GmbH, Maxfeldstr. 5, D-90409 Nürnberg
Tel: +49-911-74053-0; Fax: +49-911-7417755; https://www.suse.com/
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard,
Graham Norton, HRB 21284 (AG Nürnberg)


Attachments:
signature.asc (499.00 B)
OpenPGP digital signature