2010-02-18 15:30:55

by Jerome Glisse

[permalink] [raw]
Subject: Uncool feature for TTM introduced by x86, pat: Use page flags to track memtypes of RAM pages

Hi,

commit id: f58417409603d62f2eb23db4d2cf6853d84a1698

TTM is doing uncommon use of set_memory_wc|uc|wb for instance it's not
uncommon for TTM to change memory type from wc to uc or from uc to wc.
Since x86, pat: Use page flags to track memtypes of RAM pages (commit
id above) this isn't allowed anymore, before going from uc to wc or
wc to uc we first have to free the memtype by going through wb this
means an extra step which likely lead to some overhead (i guess that
uc -> wc or wc -> uc won't trigger massive tlb/cpu flush while
uc -> wb -> wc or wc -> wb -> uc will). reserve_ram_pages_type is the
function which will check that memory is wb thus enforcing us to go
through wb step.

Can we modify the interface to support again changing from uc to wc
or wc to uc ? (i can try to do a patch for that).

If no, we have a sever regression on non PAT arch :
http://bugzilla.kernel.org/show_bug.cgi?id=15328
(AFAIK bugzilla.kernel.org is down for me at the moment) because we
are doing the extra set wb step (i haven't dived deep into the code
to check what happen on non PAT). My guess is that on non PAT the
extra set wb can be avoided right ? Also what is your educated guess
on why on non PAT this extra set wb is slowing down thing badly ?
Last can we make this extra step only on PAT enabled arch ?

In PAT case right now we are calling get get_page_memtype to know
if we need to set page to wb first my understanding is that we should
protect this call by memtype_lock spinlock (even if i don't think we
can have collision here as we are protected by TTM locking), maybe
we can directly use TTM information to call or not set wb and avoid
using get_page_memtype.

Sorry for being such annoying user of PAT, i guess GPU is the only
place having such strange and intensive cache change pattern.

Cheers,
Jerome


2010-02-18 16:42:01

by Jerome Glisse

[permalink] [raw]
Subject: Re: Uncool feature for TTM introduced by x86, pat: Use page flags to track memtypes of RAM pages

On Thu, Feb 18, 2010 at 04:30:32PM +0100, Jerome Glisse wrote:
> Hi,
>
> commit id: f58417409603d62f2eb23db4d2cf6853d84a1698
>
> TTM is doing uncommon use of set_memory_wc|uc|wb for instance it's not
> uncommon for TTM to change memory type from wc to uc or from uc to wc.
> Since x86, pat: Use page flags to track memtypes of RAM pages (commit
> id above) this isn't allowed anymore, before going from uc to wc or
> wc to uc we first have to free the memtype by going through wb this
> means an extra step which likely lead to some overhead (i guess that
> uc -> wc or wc -> uc won't trigger massive tlb/cpu flush while
> uc -> wb -> wc or wc -> wb -> uc will). reserve_ram_pages_type is the
> function which will check that memory is wb thus enforcing us to go
> through wb step.
>
> Can we modify the interface to support again changing from uc to wc
> or wc to uc ? (i can try to do a patch for that).

Ok, we are likely not hit by wc -> uc or uc -> wc change (which is dumb
but i haven't yet understand what is exactly happening for the user)
My guess is that on non PAT get_page_memtype always return -1
Thus i think we will need to export a function bool pat_is_enabled() so
ttm can pickup the proper path in the unlikely/broken case of
wc -> uc.

> If no, we have a sever regression on non PAT arch :
> http://bugzilla.kernel.org/show_bug.cgi?id=15328
> (AFAIK bugzilla.kernel.org is down for me at the moment) because we
> are doing the extra set wb step (i haven't dived deep into the code
> to check what happen on non PAT). My guess is that on non PAT the
> extra set wb can be avoided right ? Also what is your educated guess
> on why on non PAT this extra set wb is slowing down thing badly ?
> Last can we make this extra step only on PAT enabled arch ?
>
> In PAT case right now we are calling get get_page_memtype to know
> if we need to set page to wb first my understanding is that we should
> protect this call by memtype_lock spinlock (even if i don't think we
> can have collision here as we are protected by TTM locking), maybe
> we can directly use TTM information to call or not set wb and avoid
> using get_page_memtype.
>
> Sorry for being such annoying user of PAT, i guess GPU is the only
> place having such strange and intensive cache change pattern.
>
Cheers,
Jerome

2010-02-18 17:27:50

by Andi Kleen

[permalink] [raw]
Subject: Re: Uncool feature for TTM introduced by x86, pat: Use page flags to track memtypes of RAM pages

Jerome Glisse <[email protected]> writes:
>
> Can we modify the interface to support again changing from uc to wc
> or wc to uc ? (i can try to do a patch for that).

At least on Intel CPUs that support self-snoop (all modern
ones) that should really be very cheap.

-Andi

--
[email protected] -- Speaking for myself only.

2010-02-18 17:42:50

by H. Peter Anvin

[permalink] [raw]
Subject: Re: Uncool feature for TTM introduced by x86, pat: Use page flags to track memtypes of RAM pages

On 02/18/2010 09:27 AM, Andi Kleen wrote:
> Jerome Glisse <[email protected]> writes:
>>
>> Can we modify the interface to support again changing from uc to wc
>> or wc to uc ? (i can try to do a patch for that).
>
> At least on Intel CPUs that support self-snoop (all modern
> ones) that should really be very cheap.
>

The UC/WC transition should be particularly trivial; I don't see any
reason it should have to go through any other procedure on *any* CPU --
selfsnoop shouldn't even figure into it, since neither UC nor WC
actually caches anything. For the WC->UC direction, all we should need
to do is to flush the write combiners; a simple wmb() will do that.

Self-snoop is about not having to flush something that can have been
cached, as is normally needed to do WB/WT/WP -> UC/WC.

-hpa

2010-02-18 18:47:13

by Suresh Siddha

[permalink] [raw]
Subject: Re: Uncool feature for TTM introduced by x86, pat: Use page flags to track memtypes of RAM pages

On Thu, 2010-02-18 at 08:35 -0800, Jerome Glisse wrote:
> On Thu, Feb 18, 2010 at 04:30:32PM +0100, Jerome Glisse wrote:
> > Hi,
> >
> > commit id: f58417409603d62f2eb23db4d2cf6853d84a1698
> >
> > TTM is doing uncommon use of set_memory_wc|uc|wb for instance it's not
> > uncommon for TTM to change memory type from wc to uc or from uc to wc.
> > Since x86, pat: Use page flags to track memtypes of RAM pages (commit
> > id above) this isn't allowed anymore, before going from uc to wc or
> > wc to uc we first have to free the memtype by going through wb this
> > means an extra step which likely lead to some overhead (i guess that
> > uc -> wc or wc -> uc won't trigger massive tlb/cpu flush while
> > uc -> wb -> wc or wc -> wb -> uc will). reserve_ram_pages_type is the
> > function which will check that memory is wb thus enforcing us to go
> > through wb step.
> >
> > Can we modify the interface to support again changing from uc to wc
> > or wc to uc ? (i can try to do a patch for that).
>
> Ok, we are likely not hit by wc -> uc or uc -> wc change (which is dumb
> but i haven't yet understand what is exactly happening for the user)
> My guess is that on non PAT get_page_memtype always return -1
> Thus i think we will need to export a function bool pat_is_enabled() so
> ttm can pickup the proper path in the unlikely/broken case of
> wc -> uc.

Sure. We can definitely look into extending PAT API if there is a need.

But it looks like you have a bug in the commit
db78e27de7e29a6db6be7caf607cf803d84094aa causing this performance
regression.

> > If no, we have a sever regression on non PAT arch :
> > http://bugzilla.kernel.org/show_bug.cgi?id=15328

This is the commit db78e27de7e29a6db6be7caf607cf803d84094aa:

- switch (c_state) {
- case tt_cached:
- return set_pages_wb(p, 1);
- case tt_wc:
- return set_memory_wc((unsigned long) page_address(p), 1);
- default:
- return set_pages_uc(p, 1);
+ if (get_page_memtype(p) != -1) {
+ /* p isn't in the default caching state, set it to
+ * writeback first to free its current memtype. */
+
+ ret = set_pages_wb(p, 1);
+ if (ret)
+ return ret;
}
+
+ if (c_state == tt_wc)
+ ret = set_memory_wc((unsigned long) page_address(p), 1);
+ else if (c_state == tt_uncached)
+ ret = set_pages_uc(p, 1);
+
+ return ret;

You are not setting the pages back to wb() before you free it. And this
is the reason for your performance issue.

We can fix it by changing the if check to something like:

if (get_page_memtype(p) != -1 || c_state == tt_cached)

thanks,
suresh


2010-02-18 20:28:03

by Andi Kleen

[permalink] [raw]
Subject: Re: Uncool feature for TTM introduced by x86, pat: Use page flags to track memtypes of RAM pages

On Thu, Feb 18, 2010 at 09:38:01AM -0800, H. Peter Anvin wrote:
> On 02/18/2010 09:27 AM, Andi Kleen wrote:
> > Jerome Glisse <[email protected]> writes:
> >>
> >> Can we modify the interface to support again changing from uc to wc
> >> or wc to uc ? (i can try to do a patch for that).
> >
> > At least on Intel CPUs that support self-snoop (all modern
> > ones) that should really be very cheap.
> >
>
> The UC/WC transition should be particularly trivial; I don't see any
> reason it should have to go through any other procedure on *any* CPU --
> selfsnoop shouldn't even figure into it, since neither UC nor WC
> actually caches anything. For the WC->UC direction, all we should need
> to do is to flush the write combiners; a simple wmb() will do that.

I'm not sure that is sanctioned by the SDM rules; AFAIK they don't
make any special exception for this case.

-Andi
--
[email protected] -- Speaking for myself only.